EPIC™ XE-900

1.0 GHz CPU

Need reliability?
-40° to +85°

                 XE-900                       XE-800               XE-700
CPU              Via Eden                     AMD Geode GX1        STPC
Clock speed      400 MHz; 733 MHz; 1.0 GHz    300 MHz              133 MHz
BIOS             General Software             Phoenix              Phoenix
DRAM support     to 256 MB                    to 256 MB            32/64 MB
CompactFlash     Type I or II                 Type I or II         Type I or II
COM 1            RS-232                       RS-232/422/485       RS-232
COM 2            RS-232                       RS-232/422/485       RS-232/422/485
COM 3            RS-232                       NA                   RS-422/485
COM 4            RS-232                       NA                   RS-232
COM 5            RS-232/422/485               NA                   NA
COM 6            RS-422/485/TTL               NA                   NA
LPT1             0                            0                    1
EIDE             2                            2                    1
USB              2                            6                    2
CRT              1600 x 1200                  1280 x 1024          1280 x 1024
Flat panel       LVDS                         yes                  yes
Digital I/O      24-bit prog.                 48-bit prog.         24-bit prog.
Ethernet         10/100 Base-T                Dual 10/100 Base-T   10/100 Base-T
Expansion        PC/104 & Plus                PC/104 & Plus        PC/104
Power            3.6A operating               1.6A max.            1.6A max.
Temp. range      -40° to 70/85° C             -40° to 80° C        -40° to 80/85° C
Shock/vibration  40/5g                        40/5g                40/5g


Need Linux, QNX, Windows®?
Try our OS EMBEDDER™ KITS


Our kits are the shortest path to 
a successful OS on an Octagon 
embedded computer. 

• Pick your Octagon SBC 

• Pick the OS you prefer: Linux, 
Windows, QNX 

Octagon delivers a high 
performance, total solution. 




Typical Linux kit includes: 

• Target CPU card 

• Preloaded OS image on 256 MB 
industrial CompactFlash 

• 256 MB SO-DIMM module 

• Interface cables 

• Hard copy of manual 

• Mouse 

• CPU OS bootable CD 

• Optimized OS version 

• Full driver support for 
on-board hardware 

• X-Windows support 

• Example applications and 
source code 

• Extra documentation 
















































X-SRAM-2 MB 

• 2 MB high speed, SRAM 

• Read and write at full bus speed 

• Pointers to memory saved if CPU 
resets or loses power 


Need PC/104 expansion?
Try our XBLOKs®


X-DIO-48 bit programmable 
digital I/O 

• 48 digital I/O, 5V compatible 

• Source and sink 16 mA per output 

• Direct connection to 
opto-module racks 

X-COM-2 dual UART 

• Up to 230.4 kBaud data rate 

• Supports RS-232/422/485 

• RS-485 fault protected to ±60V

X-LAN-1 Ethernet LAN

• 10/100 Base-T, Intel 82551ER


XBLOKs offer the best compromise 
in cost and function for both PC/104 
and PC/104-Plus. Only 44% the size
of a standard PC/104 card, you can 
add two functions to your system 
but increase the stack height by 
only one level. -40° to 85° C. Heat 
diagram shows enhanced cooling. 
• Fully plug-n-play

• High performance, PCI bus interface

X-USB-4 quad USB 2.0 

• Speeds up to 480 Mbps

• Mix and match USB 1.1 and 2.0

• Current-limited ports can supply 
500 mA to external devices 


Need a fanless system?
NEW CONDUCTION COOLING SYSTEM



Designed for the XE-900, 
our conduction cooling system 
eliminates a fan even at 1.0 GHz. 


For a full listing of 
Octagon Systems 
products, visit us at 

www.octagonsystems.com 
SBS Technologies® has made AdvancedMC™ a reality


SBS knows AdvancedMC. Choose from our large selection of real, 
available products. 


SBS DOESN'T JUST ANNOUNCE
AdvancedMC Modules, we build them.
And you're looking at real products.
For a long time, a modular, open
telecom standard was just a dream. SBS
Technologies is making it a reality.

In less than one year SBS has introduced
and built more than a dozen AdvancedMCs
and a number of carriers. We know it can
be tough to keep up with all the progress
AMCs are making, so we created the
AdvancedMC Insider monthly newsletter
with latest AMC news. To subscribe, go to
www.advancedmcinsider.com.


NAME                 DESCRIPTION
TELUM TSPE01         Processor AMC module with PowerPC® 7447A processor
TELUM ASLP10         Intel® Pentium® M processor AMC module
TELUM 624/628-TEJ    WAN Edge Access I/O modules 4 or 8 port T1/E1/J1
TELUM 1001-012M/S    WAN OC-12 module
TELUM 1001-03        WAN OC-3 module
TELUM 1004-03M/S     WAN OC-3 module
TELUM 1001-DE        WAN DS3/E3 module
TELUM 1204-03        WAN intelligent AMC.2 multi-service 4-port OC-3 module
TELUM GE-QT          Gigabit Ethernet AMC 4 port NIC
TELUM FC2312-FF      Fibre Channel HBA cards (fiber-optic media)
TELUM FC2312-CC      Fibre Channel HBA cards (copper media)
AT-AMC1              AdvancedTCA® carrier for 2-4 AMC.1 module
AT-AMC2              AdvancedTCA® carrier for 2-4 AMC.2 module
BCT4-AMC1            IBM® BladeCenter® T carrier for 4 AMC module
TELUM GPSTC-AMC      GPS-based clock AMC module
TELUM 2001-VGA       AMC VGA module


SBS Technologies. SBS knows.

Find the AdvancedMC product you’re looking for at www.sbs.com/amc or call 800.SBS.EMBEDDED
AMC modules can be half- or full-height,
single- or double-wide. These sizes allow 
front-panel insertion into ATCA carrier 
cards. • Pg. 14 



System-level diagram of an IP-centric
FPGA design. • Pg. 60



Slice Architecture Tackles Growing Thermal
Demands of High Performance • Pg. 72
Operate and
survive under
the most extreme
conditions with
ruggedized E-Disk®
solid-state flash drives and
network storage solutions.
BiTMICRO’s cutting-edge
storage technologies offer utmost
reliability, optimum data security and
unmatched performance.


Ethernet | Fibre Channel | SCSI | IDE/ATA
USB | FireWire | cPCI | VME | SATA | iSCSI
PCI-X | PCI Express | SAS | InfiniBand


BiTMICRO

ULTIMATE STORAGE SOLUTIONS™


BiTMICRO Networks, Inc.
45550 Northport Loop E
Fremont, CA 94538-6481


www.bitmicro.com
info@bitmicro.com
510-743-3475


MISSION CRITICAL

data storage modules 



Extreme Comprehensiveness: We offer the most comprehensive VME/cPCI 
storage product line in the world, offering device alternatives for 
any standard or unique application. 

• Solid State Disk • Removable Hard Disk 
• Tape Drives • Optical Disk • PCMCIA Adapter 
Extreme Performance: Our VME products feature extreme speed, capacity and
rugged reliability with 320 MB/sec throughput enabled by
LVD SCSI technology, storage capacity of more than 600 GB
per module and a 1,400,000 hour MTBF.

Extreme Quality: Phoenix International is the only 
manufacturer of VME data storage products that is 
ISO 9001:2000 Certified. 
AdvancedTCA • CompactPCI • PCI/ISA • Embedded PCI-X • ETX • COM Express • Custom Design and Manufacturing




more at www.dtims.com/rtc


AdvancedTCA 


GLOBALCOMM 2006



Nodes and Switches 


Three Decades of Embedded Solutions 


• DTI Board and System Level Products 

- AdvancedTCA 

- CompactPCI 

- COM Express and ETX 

- PCI/ISA and Embedded PCI-X 

- Custom Offerings 


• DTI Technical Presentations will be offered 
hourly at GLOBALCOMM - Booth 17051 

- AdvancedTCA Gotchas 

- AdvancedTCA Fabrics 

- COM Express Overview 


• Engineering and Manufacturing 
are under-one-roof, leading 
to a reduced time-to-market. 


Diversified
Technology®

An Ergon Co. 


CPU Boards - Switch Blades - Rackmount Systems - Modular Platforms 

Computer Modules - System Integration - Custom Designs - Outsourcing Capabilities 

1.800.443.2667 • sales@dtims.com • http://www.dtims.com/rtc


All trademarks and tradenames are the property of their respective owners. 













GE Fanuc Automation 



Embedded Performance. 


Looking for an embedded computing solution that gives you a
tremendous advantage over your competition? Look no further 
than GE Fanuc Embedded Systems. 

Featuring a comprehensive offering that includes Intel-based 
SBCs and complete I/O systems, industry-leading communications 
technology, rugged flat panel monitors and computers and more. 



ATCA-7820 

AdvancedTCA Dual Core Intel Xeon 
Processor LV 2.0 GHz Processor Node Board 

• Combines two dual core processors, 
providing four high performance cores 

• PICMG 3.0/3.1 compliant 

• Processor speeds up to 2.0 GHz 

• Up to 8 GB DDR-2 SDRAM with ECC 

• AMC.1 compliant site (PCI Express x8)

• Single PCI-X PMC site at 64-bit/66 MHz 

• Single 10/100 Ethernet interface 

• Dual serial ports 

• Four USB 2.0 ports 

• Serial ATA interface 

• Optional 2.5-inch IDE hard disk drive 

• OS support for Windows XP, Windows
2000, and Carrier Grade Linux



CPCI-7808 

Intel Pentium M CompactPCI 
Single Board Computer 

• PICMG 2.16/2.9 compliant 

• Processor speeds up to 1.8 GHz 

• Up to 2 GB DDR SDRAM 

• Dual PMC sites 

- 64-bit/66 MHz site 
-32-bit/33 MHz site 

• Dual 10/100/1000 Ethernet interface 

• Dual 16550-compatible serial ports 

• Three USB 2.0 ports 

• Serial ATA interface 

• Up to 1 GB CompactFlash 

• OS support for Windows XP, Windows
2000, QNX, Linux, and VxWorks



FANUC 


Embedded Systems 


GE Fanuc Embedded Systems can support your full range of
embedded computing needs to solve your greatest challenges. 
From standard product requests to a solution that is quickly and 
fully customized to your specific application, GE Fanuc Embedded 
Systems has the breadth, depth and support capabilities to provide 
a serious boost to your performance. 

Learn more at www.gefanuc.com/embedded



CP920 

CompactPCI Managed 
Gigabit Ethernet Switch 

• PICMG® 2.16 compliant 

• Layer 2/3/4 switching 

• Twenty-four 10/100/1000 Ethernet ports 

• PICMG® 2.9 Rev 1.5 IPMI compliant 

• PICMG® 2.1 Rev 2.0 hot swap compliant 

• 802.1p, 802.1Q VLAN, deep packet filtering,
link aggregation, Rapid Spanning
Tree (802.1w, 802.1d), broadcast storm
control, port mirroring

• Conduction cooled model available 

- Twelve 10/100/1000 Ethernet ports 



PMC-0247 

Serial ATA Hard Disk Drive Module 

• 40 Gbyte or 80 Gbyte options available 

• Support for SATA I (150 Mbytes/s) and
SATA II (300 Mbytes/s) interfaces

• Support for programmable External 
Flash for BIOS expansion 

• Supports 32/64-bit, 133 MHz maximum 
PCI-X interface 

• Fast read/write performance 

• VITA 39 compliant 

• OS support for Windows XP, Windows
2000, Red Hat Linux, and Enterprise 4.0


©2006 GE Fanuc Automation. All rights reserved. 
FPGAs:

The New Matrix for Design 


by Tom Williams, Editor-in-Chief 

The general dictionary definition of a matrix calls it, “a situation
or surrounding substance within which something else
originates, develops, or is contained.” No, it is not a
computer-generated virtual reality fed into the brains of comatose
humans producing power for space aliens. Despite that, the
definition ideally fits the characteristics of today’s programmable logic
devices. They are a “surrounding substance” of gates from which
developers can now evoke the functionality of entire systems. As
a result of advances in speed, density and limited degrees of
specialization, FPGAs today represent a landscape that is as yet not
fully explored.

Within this issue, RTC is proud to bring you a special
supplementary section on this “New Matrix for Design,” a technology
that is assuming roles that previously were filled by coprocessors,
CPUs, communications processors, DSPs, ASICs and,
intriguingly enough, software. From their humble beginnings as “glue
logic,” programmable logic devices have grown to include both
soft and hard processor cores and the ability to incorporate both
hardware and software functionality in the same device. Not only
that, they have the ability to be reconfigured and updated through
downloading new code while still installed in their system. Some
today even have the capability of being partially reconfigured in
a running system.

In a world that lives and dies by the whims of the twin
demons, Cost and Time-to-Market, and their evil sister,
Performance, FPGAs are helping developers move their ideas more
quickly into functioning reality. On one level, they simply are
used as a central part of a final design, carrying IP cores, custom
logic and software functions defined by the developer. On
another level, they are used as prototypes to prove a design before it
is committed to silicon, and on yet another they serve as the step
to implementation of a structured ASIC, which can enhance the
bottom line with smaller size, lower power consumption and, above all,
lower cost as volumes ramp up. ASICs have increasingly been moved
to those rarefied realms where only the highest volume
production can justify the expense and risk of the increasingly complex
development cycle.

FPGAs, when used as what might be termed “silicon
software engines,” can accomplish what no von Neumann architecture
can achieve, and that is truly parallel execution. Without multiple
processors or cores, normal CPUs have to “fake it” with
interrupts, context switches and deadline scheduling. FPGAs make it
look relatively easy.

Another interesting development has to do with hardware
inventory. A manufacturer can now design a single board-level
hardware platform consisting of perhaps a CPU and a set of
FPGAs, and by incorporating different IP into the FPGAs,
essentially offer completely different products to the customer that
actually have the exact same hardware in common.

Of course, it’s never all roses. We are still getting our
bearings on how to fully utilize the potential of the FPGA and much
is still to be learned through experience. For one thing, the
development tools are still, well, developing. We have come a long
way since the days when an engineer entered a net list into a
computer, and even since FPGAs were primarily programmed using
a hardware description language or RTL. Now we are in an era
when developers can define functionality using graphical tools or
even familiar software tools, such as in a “C-to-gates” scenario.

Debates rage as to which approaches are best and that is a
good thing. Yet the FPGA has made steady progress from lowly
“glue” to important IC replacement to coprocessor and on to
main processing element, and it is now in some cases the whole
“shebang” in many applications. That is not to say that CPUs and
DSPs are being replaced; far from it. In a vast number of cases
the technologies complement one another. Still, it is exciting to
see this versatile and powerful technology coming into its own,
and it will be even more intriguing to watch as engineers and
developers discover ever more innovative ways that it can be put
to work solving the problems of high-end systems.
Software Compilation to FPGA Coprocessor Enables
Supercomputing Adoption

Celoxica has announced software programming support for the SGI RASC RC100 blade
from Silicon Graphics. Based on SGI’s Reconfigurable Application-Specific Computing (RASC)
technology, the RC100 computation blade packs the power of dozens of supercomputer
nodes into a single blade by leveraging the parallelism of dual Xilinx Virtex 4 FPGAs. Using
Celoxica’s DK Design Suite and libraries, the RC100 blades can be programmed directly by
the end user to accelerate custom C software algorithms, overcoming the traditional barrier
to FPGA-based computing.

The Celoxica environment speeds and simplifies the programming of FPGA devices from
C-language software, enabling the acceleration of many high-performance computing (HPC)
applications by orders of magnitude over conventional systems. Celoxica’s library for the
RASC RC100 provides C-language calls to access all the SGI core functions, registers,
memories and debug resources. The DK Design Suite compiles these software functions and user
algorithms directly to custom hardware to take full advantage of the parallelism of the FPGA
devices and the performance of the SGI architecture. Celoxica and SGI have also signed a 
resale agreement making the Celoxica programming environment available through the SGI 
sales force. 


New PICMG 

Subcommittee to Enhance 
ATCA Interoperability 

The PICMG Executive 
members have approved the 
formation of the Requirements 
Engineering Subcommittee 
(RES), which is chartered to 
summarize and categorize the 
mandatory and optional sections of 
AdvancedTCA and AdvancedMC 
specifications to make it easier for 
users and vendors to implement 
commercial off-the-shelf systems 
using these specifications. 

The AdvancedTCA 

specifications are powerful 
and comprehensive, but they 
also include options for several 
different market segments and 
different network elements within 
the Telecom market segment. 
The main goal of the RES is 
to develop a framework for 
industry groups to define vertical 
market profiles and provide 
recommended requirement sets 
for each specification. 

PICMG does not define 
profile preference or compliance 
or certification requirements, 
so the RES will concentrate 


on the framework that other 
organizations can use. RES will 
also provide valuable feedback to 
the other PICMG subcommittees 
when additional requirements 
or clarifications are needed. The 
formation of the RES comes as 
the take-up of AdvancedTCA is 
accelerating and the interests of 
the user groups are high. 

The initial work of the RES 
will address the needs of the 
Telecom market, but other market 
segments such as Enterprise 
Networks, Military, Industrial 
Control and Medical can follow. 
RES was initially formed by 
128 representatives from 65 
companies and organizations. 



More have joined since. Details 
of all PICMG activities and 
information on its specifications 
can be found at the PICMG Web 
site at www.picmg.org 


HyperTransport 
Consortium Announces 
New 3.0 Specification 

The HyperTransport 

Technology Consortium has 
released version 3.0 of the 
HyperTransport specification. 
The new standard nearly doubles 
the bandwidth and speed of 
the previous HyperTransport 
2.0 specification. In addition, 
HyperTransport 3.0 supports 
a variety of new features 
including AC coupling mode, 
hot plugging, un-ganging mode 
and dynamic power management 
for the support of extended 
signal transmission distance, 
typical of backplane and chassis- 
to-chassis implementations. 
HyperTransport 3.0 builds on the 
existing HyperTransport 1.0 and 
2.0 standards. HyperTransport 
3.0 is fully backward compatible 
with earlier versions of the 
HyperTransport specification 
standard. 

“The added performance and 


new features of HyperTransport 
3.0 extend the applicability of 
HyperTransport technology 
from chip-to-chip and board-to- 
board, all the way to chassis-to- 
chassis applications,” said Mario 
Cavalli, general manager of the 
HyperTransport Technology 
Consortium. “HyperTransport 
has proven to be the industry’s 
most flexible, powerful and future- 
ready standard interconnect 
solution for compute-intensive 
system designs, delivering a 
winning combination of high- 
performance, standardization and 
optimized total cost of ownership 
(TCO) for data center and 
supercomputing applications.” 

HyperTransport 3.0
extends the 1.4 GHz dual data
rate (DDR) maximum clock
of HyperTransport 2.0 to 1.8
GHz, 2.0 GHz, 2.4 GHz and 2.6
GHz, and delivers a maximum
aggregate bandwidth of 41.6
Gbytes/s, a bandwidth increase of
86 percent over HyperTransport
2.0.
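
The clock and bandwidth figures quoted for HyperTransport 3.0 can be checked with a few lines of arithmetic. The sketch below is only a sanity check: the 32-bit link width and the reading of "aggregate" as both directions added together are assumptions, not something stated in the announcement, and the function names are made up for illustration. Under those assumptions a 2.6 GHz DDR clock works out to 41.6 billion bytes per second, and the 86 percent gain follows from the clock ratio alone.

```python
# Sanity check of the HyperTransport 2.0 -> 3.0 figures quoted in the text.
# ASSUMPTIONS (not stated in the announcement): a 32-bit-wide link in each
# direction, and "aggregate" meaning both directions added together.

LINK_WIDTH_BITS = 32      # assumed link width per direction
DIRECTIONS = 2            # assumed: transmit and receive counted together
TRANSFERS_PER_CLOCK = 2   # dual data rate (DDR), as stated in the text


def aggregate_bytes_per_second(clock_hz: float) -> float:
    """Peak aggregate bandwidth in bytes per second for a given link clock."""
    transfers_per_second = clock_hz * TRANSFERS_PER_CLOCK
    bits_per_second = transfers_per_second * LINK_WIDTH_BITS * DIRECTIONS
    return bits_per_second / 8


def percent_increase(old: float, new: float) -> float:
    """Relative increase from old to new, in percent."""
    return (new / old - 1.0) * 100.0


if __name__ == "__main__":
    ht2 = aggregate_bytes_per_second(1.4e9)   # HyperTransport 2.0 maximum clock
    ht3 = aggregate_bytes_per_second(2.6e9)   # HyperTransport 3.0 maximum clock
    print(f"HT 2.0: {ht2 / 1e9:.1f} Gbytes/s")
    print(f"HT 3.0: {ht3 / 1e9:.1f} Gbytes/s")
    print(f"Increase: {percent_increase(ht2, ht3):.0f} percent")
```

Because the link width and direction count cancel out of the ratio, the percentage holds whatever width is assumed; only the absolute figure depends on it.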

Solarflare Communications 
and Level 5 Networks 
Agree to Merge 

Solarflare Communications 
and Level 5 Networks, two 
leading early-stage Ethernet 
companies, have signed a 
definitive agreement to merge and 
raise new funding, increasing the 
cash balance of the new company 
to over $50 M. The merger was 
driven by the common vision 
shared by the two companies 
to deliver high-performance, 
cost-effective, standards-based 
Ethernet products that enable 
one network for compute, storage 
and network traffic. The merged 
company will be called Solarflare 
Communications and will 
retain its headquarters in Irvine, 



Get Connected with companies mentioned in this article. 
www.rtcmagazine.com/getconnected 
Discover how Dan and the QNX team can sharpen
your competitive edge. Download your free product 
evaluation from www.qnx.com/innovate. 

© 2006 QNX Software Systems GmbH & Co. KG, a Harman International Company. All rights reserved. QNX and Neutrino are trademarks of 
QNX Software Systems GmbH & Co. KG, registered in certain jurisdictions and are used under license. All other trademarks and trade names 
belong to their respective owners. 301813 MC339.13 


“QNX has consistently
defined the leading edge
of RTOS technology.”


Dan Dodge, QNX CEO & CTO.

OS architect and father of embedded computing.


Slash your debugging time by weeks, even months, with the 
QNX® Neutrino® RTOS, the most innovative operating system 
on the market today. Unlike conventional OSs, QNX Neutrino 
runs all applications and system services — even device 
drivers — as memory-protected components. So you can 
detect memory violations immediately. And focus on what 
really counts: building innovative features, faster. 

Combine this with performance rated #1 in the RTOS market 
and reliability proven in millions of installations, and you 
have the platform to power your own leading-edge design. 




Cut your development time. 
Build longer-lasting products. 
Gain maximum performance. 

Only the QNX Neutrino RTOS gives you: 


QNX SOFTWARE SYSTEMS 


Optimized support for ARM®, MIPS®, PowerPC®, 
SH-4, XScale®, and x86 processors 


Preintegrated stacks for IPv4, IPv6, IPsec, 
SNMP, SSH, SCTP, TIPC, IP Filtering and NAT 


Royalty-free kits for multimedia, flash file 
systems, 3D/2D graphics, web browsers, etc. 


Unparalleled support for open standards: 
POSIX, Eclipse™, OpenGL® ES, RapidIO® 


Focus on Innovation, not Debugging 


Adaptive partitioning to contain security 
threats and guarantee realtime behavior 


Multi-core, multi-processing support for 
the ultimate in scalability and performance 


In the QNX Neutrino RTOS, device drivers, file systems, and 
protocol stacks all run outside of the kernel, as memory-protected 
processes. This architecture virtually eliminates memory corruptions, 
mysterious lockups, and system resets. Achieve maximum reliability 
and put an end to endless debug sessions. 


[Diagram: Microkernel and Message-Passing Bus, with memory-protected TCP/IP, Web Browser, Media Player, HTTP Server and USB Device Driver components]
California, with development
centers in Sunnyvale, California 
and Cambridge, UK. Russell 
Stern, Solarflare president and 
CEO, will head the combined 
company. 

Solarflare is a developer of
10GBASE-T chips, which enable
simple, relatively inexpensive 10
Gbit interconnects that leverage
the installed base of twisted-pair
copper cabling and Ethernet
switching infrastructure. Level
5’s 1 Gbit/10 Gbit Ethernet

controller technology is consistent 
with the Ethernet paradigm. 
By the simple installation of its 
accelerated Ethernet controllers, 
server system performance is 
increased with no need to change 
existing applications or to deploy 
new protocols. 

Through this merger, 10
Gbit Ethernet is expected to
deliver on its promise to be one
ubiquitous network. Ethernet has
evolved from local area network
(LAN) to wireless (WiFi) to
wide area network (WAN), and
most recently to metro Ethernet
services. To meet the growing
demand for speed, Ethernet has
gone from 10 Mbits/s to 100
Mbits/s to 1 Gbit/s and is now
moving inexorably up to 10 Gbits/s,
on its way in future years to 100
Gbits/s.

Aonix and LynuxWorks Ink 
Strategic Alliance 

Aonix, a provider of solutions 
for safety- and mission-critical 
applications, has announced a 
strategic agreement in which 
LynuxWorks will market and 
distribute Aonix technology 
alongside its LynxOS product 
line. The final agreement will give 





LynuxWorks’ Java and Ada 
customers access to Aonix 
embedded solutions integrated 
with the LynxOS product line. 
This agreement has immediate 
benefit because both Aonix 
ObjectAda and PERC product 
lines offer out-of-the-box support 
for LynuxWorks technologies. 

Aonix PERC technology 
supports LynuxWorks LynxOS 
and LynxOS-178. The newest 
PERC products—PERC Ultra and 
PERC Pico—complete an end-to- 
end solution for embedded Java 
development. PERC Ultra, with its 
compatibility with the broad range 
of off-the-shelf Java Standard 
Edition libraries and components, 
addresses the needs of complex 
embedded systems. PERC Ultra 
is widely used in applications 
such as network infrastructure, 
telematics, command and control 
and office automation. PERC 
Pico, with its tight execution time 
and footprint constraints as well 
as its low-level device access, is 
designed for deeply embedded 
applications. PERC Pico is 
typically preferred for device 
control, sensor management 
and safety-critical applications. 
PERC Ultra and PERC Pico run 
on LynxOS, either separately or 
cooperatively. 

The ObjectAda 8.2 Linux
and Solaris development platforms
targeting the PowerPC/LynxOS
embedded platforms, both
released independently by
Aonix in February this year,
include an Ada 95 compiler, Ada
95 optimizer, partial Annex C
support, partial Annex D support,
syntactic editor and both graphical
and command line interfaces.

Kuka Controls Announces 
Industrial Platform 
Partnership with Wind 
River 

Kuka Controls has 
announced today its designation 
as a Wind River Platform Partner. 
Partnership status identifies 
Kuka Controls as a provider of 
device software optimization 
(DSO) solutions for real-time 
applications that integrate with 


Wind River products. Wind River 
selectively reviews and appoints 
those companies whose solutions 
are proven compatible with Wind 
River products, provide technical 
advantages to the end user and 
are supported effectively in the 
worldwide market. 

Kuka Controls’ software- 
only technology enables the 



coexistence of Microsoft 
Windows XP and a Real-Time 
Operating System (RTOS) on a 
single computer system without 
inhibiting the functionality or 
performance of either operating 
system and without modifying 
either OS. The determinism and 
response of the RTOS is preserved. 
This coexistence increases 
overall system performance and 
reliability while reducing power 
and space requirements. In many 
applications, the Kuka Controls’ 
extensions eliminate the cost of 
additional graphics/visualization 
hardware. Kuka Controls’ 
VxWin product adds the high- 
end graphical interface and 
connectivity, including printing, 
of Windows XP to Wind River 
VxWorks. Additionally, VxWin 
adds real-time ability to Windows 
XP utilizing VxWorks. The end 
user also gains the benefit of the
many Windows applications that
can be used along with the
real-time system.
AMC as System Solution


AMC Offers System Design 
Flexibility 


Developed as an enhancement to ATCA, the AMC standard has gone
beyond its original function of lowering system costs. Its combination
of performance and size allows it to be used, with the right framework,
to develop powerful yet compact systems.


by Todd Wynia 

Artesyn Communication Products 


A few years ago, at the beginning of the
new millennium, it was becoming
clear that the popular CompactPCI
framework could not address all
telecommunications applications. As
telecommunications moved from circuit-switched
to packet-switched technologies and as
both networking speed and demand grew,
CompactPCI was beginning to run out of
room for some applications. Cards were
not large enough to contain the number
of channels required in network server
blades used in the central office, nor did
they offer enough power density to handle
the needed circuits.

In conjunction with telco operators,
PICMG defined the new AdvancedTCA
(ATCA) platform for next-generation
telecommunications system designs. The
specification, adopted in 2003, offers
many features that address the needs of
modern telecommunications systems. It
uses a high-performance, switched-fabric
serial backplane, operating at speeds
of up to 10 Gigabits/s per link. The fabric
provides a natural interface between the
equipment and today’s packet-switched
networks.

AdvancedTCA also incorporates
the elements necessary to ensure
high-reliability system designs. This includes
built-in redundancy in essential system
structures such as backplane channels and
system management. It also includes a
native ability to allow hot-swapping of
circuit cards so that the system can be
maintained without shutting it down. In
addition, ATCA defines a system management
function and a serial Intelligent Platform
Management Interface (IPMI) that allows
the system management software to
configure boards, gather status information
and remotely shut down boards that fail.

The ATCA specification has several
key features. One is that the serial
backplane interface is protocol-agnostic. It can
support such packet-oriented protocols as
Ethernet, InfiniBand, PCI Express and
RapidIO, giving developers considerable
design flexibility. Another is that it
utilizes relatively large 8U cards with a
power budget as great as 200W per card.

To provide the ability for hot-swapping,
the IPMI interface gives the system
management board full control over other
system boards. A newly installed board
powers and initializes only the IPMI
interface so that system buses are not
disturbed. The system manager then queries
the board to determine its functions. Only
if the system manager gives the command
will the board power up the rest of its
circuitry and come on-line. If the board
should fail in some way, the system
manager can command the board to go
off-line and power down, preventing it from
disturbing the system while it is being
removed and replaced.
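
The hot-swap sequence just described can be pictured as a small state machine. The following is a minimal sketch under stated assumptions: the state names, event names and the `Board` class are hypothetical labels chosen to mirror the prose, not the state names or commands defined in the ATCA or IPMI specifications.

```python
from enum import Enum, auto


class BoardState(Enum):
    """Hypothetical board states mirroring the sequence in the text."""
    NOT_PRESENT = auto()
    MANAGEMENT_ONLY = auto()   # only the IPMI interface is powered
    IDENTIFIED = auto()        # system manager has queried the board
    ACTIVE = auto()            # full circuitry powered, board on-line
    OFFLINE = auto()           # powered down, safe to remove


# Allowed transitions, keyed by (current state, event). Any pair that is
# missing here is rejected, e.g. powering up before the manager approves.
TRANSITIONS = {
    (BoardState.NOT_PRESENT, "insert"): BoardState.MANAGEMENT_ONLY,
    (BoardState.MANAGEMENT_ONLY, "query"): BoardState.IDENTIFIED,
    (BoardState.IDENTIFIED, "activate"): BoardState.ACTIVE,
    (BoardState.ACTIVE, "deactivate"): BoardState.OFFLINE,
    (BoardState.OFFLINE, "remove"): BoardState.NOT_PRESENT,
}


class Board:
    def __init__(self) -> None:
        self.state = BoardState.NOT_PRESENT

    def handle(self, event: str) -> BoardState:
        """Apply an event from the system manager or the operator."""
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise ValueError(
                f"event {event!r} not allowed in state {self.state.name}"
            )
        self.state = TRANSITIONS[key]
        return self.state


if __name__ == "__main__":
    board = Board()
    for event in ("insert", "query", "activate", "deactivate", "remove"):
        print(event, "->", board.handle(event).name)
```

The point of the table-driven form is that a board can never reach the active state without passing through the management-only and identified states, which is the property that keeps a newly inserted board from disturbing the system buses.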

AMC Improves on ATCA 

The original ATCA definition has
all the key features needed for
next-generation telecommunications systems, but
improvements are always possible. One
idea that quickly arose was a means for
reducing the cost of boards in the system.
PICMG members soon realized that
defining an appropriate mezzanine module
would allow the development of
standardized modules for key system functions,
allowing multiple vendors to offer
interchangeable modules and thus lowering
their cost.

AMC Module Configuration     Height (mm)  Width (mm)  Length (mm)  Power Budget (watts)
Half-height, Single-width    8.18         73.5        180.6        20
Half-height, Double-width    8.18         148.5       180.6        40
Full-height, Single-width    14.01        73.5        180.6        60
Full-height, Double-width    14.01        148.5       180.6        60

Table 1
AMC module sizes and power budgets.


The adoption of a mezzanine card
standard also provides finer granularity
in the development, inventory and
maintenance of systems. A carrier card and
four different modules allow the creation
of 24 card variations with only five items
to stock. This reduces inventory costs
for both vendor and customer. Further,
it widens the market for the five designs,
lowering both development and
production costs.
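
The article does not say how the figure of 24 variations is counted. One reading that reproduces it, offered here only as an assumption, is a carrier with four sites in which each of the four module types is used once: the number of distinct arrangements is then 4! = 24. The module names and helper functions below are hypothetical.

```python
from itertools import permutations

# Four hypothetical module types; the names are placeholders.
MODULES = ("processor", "dsp", "wan_io", "storage")


def carrier_variations(modules):
    """All orderings of the modules across the same number of carrier sites.

    ASSUMPTION: every module type is used exactly once per carrier, so the
    count is the factorial of the number of module types.
    """
    return list(permutations(modules))


def stock_items(modules):
    """Distinct part numbers to stock: one carrier plus each module type."""
    return 1 + len(modules)


if __name__ == "__main__":
    print(len(carrier_variations(MODULES)), "variations from",
          stock_items(MODULES), "stocked items")
```

If repeated module types or partly populated carriers were allowed, the count would be larger, so 24 is best read as an illustration of the inventory leverage rather than a hard limit.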

This finer granularity also benefits
system maintenance. When the IPMI
controller can access and control individual
modules and the modules are
hot-swappable, the system can support hot-swap
replacement at the module level instead
of the card level. This reduces the cost of
component failure.

PICMG released the Advanced Mezzanine Card (AMC) specification in 2005
to provide this finer granularity. These modules incorporate the same
IPMI interface and control features as the full ATCA cards. They also
feature several attributes that contribute to substantial performance.

One of the performance-enhancing features of the AMC definition is the
card size and power budget. The cards come in four different
configurations: half-height and full-height in single- and
double-width sizes (Figure 1). The different heights allow the
development of smaller modules containing only components, or larger
modules that also have room for front-panel I/O connectors and
mechanical units, such as fans and disk drives. The range of module
sizes and power budgets (Table 1) allows room for implementing
substantial functions and enough power to run them.
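
As a rough illustration of how a designer might use Table 1, the sketch below lists the configurations whose power budget covers a given requirement. The data values are copied from Table 1; the type, function and parameter names are assumptions made for this example.

```python
from typing import NamedTuple


class AmcConfig(NamedTuple):
    name: str
    height_mm: float
    width_mm: float
    length_mm: float
    power_w: int


# Values as printed in Table 1.
TABLE_1 = [
    AmcConfig("Half-height, Single-width", 8.18, 73.5, 180.6, 20),
    AmcConfig("Half-height, Double-width", 8.18, 148.5, 180.6, 40),
    AmcConfig("Full-height, Single-width", 14.01, 73.5, 180.6, 60),
    AmcConfig("Full-height, Double-width", 14.01, 148.5, 180.6, 60),
]


def candidates(required_w, full_height_only=False):
    """Return Table 1 configurations whose power budget covers required_w.

    full_height_only restricts the result to full-height modules, for
    designs that need room for connectors or mechanical units.
    """
    return [
        c for c in TABLE_1
        if c.power_w >= required_w
        and (not full_height_only or c.name.startswith("Full-height"))
    ]
```

For a hypothetical 35 W design, `candidates(35)` excludes only the half-height, single-width module.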

Advanced Mezzanine Card modules are designed to allow front-panel
insertion into the carrier card, simplifying their replacement in an
active system. This also allows them to offer front-panel I/O ports as
the design requires. The system interface at the rear connector can
include as many as 21 high-speed serial channels. As with the ATCA
backplane, the system interface of AMC modules is protocol-agnostic.


Versatility Achieved 

The AMC module specification thus results in considerable versatility.
Modules that carry fully loaded general-purpose processors, DSPs or
baseband network processors, including a full complement of memory and
peripherals, are achievable within the power budget and size envelope
of AMC modules. Other achievable functions include FPGA-based hardware
accelerators, hard disk drives, clock generators and drivers, and
intelligent I/O that runs its own protocol stack. Simple functions,
such as ordinary parallel I/O, and high-speed functions, such as an
E1/T1 interface, can be put on a module and given plenty of
front-panel connections. Even radio modules and RF amplifiers can be
readily implemented.

System              AMC Modules Used
Media Server        General-purpose CPU
                    Digital Signal Processor
                    Hard Disk Drives
Signaling Gateway   General-purpose CPU
                    Intelligent I/O Cluster
Media Gateway       General-purpose CPU
                    Digital Signal Processor
                    Intelligent I/O Cluster

Table 2

Building multiple systems with AMC modules.

This design versatility, coupled with the flexible rear-connector
serial interface and the IPMI interface, allows an AMC module to
function almost as a miniature ATCA card in its own right. This
capability did not escape the notice of PICMG and the industry, and
efforts were soon underway to define a structure that would allow use
of AMC modules in a system without a carrier card. Ratification of the
resulting definition, MicroTCA, is expected during 2006.

MicroTCA is essentially a backplane and chassis that serves as a
virtual ATCA carrier card for AMC modules (Figure 2). The backplane
supports star, dual-star and full mesh configurations. This structure
allows the creation of designs that are lower in cost and more compact
than ATCA systems while retaining significant performance. Together,
ATCA and MicroTCA offer a wide range of options for trading off among
cost, physical size, performance and capacity while employing the same
AMC modules.

Figure 2

The MicroTCA form-factor utilizes the versatility of the AMC module,
dispensing with carrier cards and plugging the modules directly into a
backplane.

In order to act as a virtual ATCA carrier, the MicroTCA system must
provide modules with the same functions offered by a carrier. These
include the switched fabric for the rear connector interface, system
clock and clock distribution and shelf management using IPMI. The
MicroTCA carrier hub provides these services to the chassis.

System Design with AMC

MicroTCA is only a beginning. As long as AMC modules receive the
necessary clock and management services, they can be used in any
physical configuration that can contain them. Form-factors such as a
MicroTCA cube are already under development, for instance, and even
smaller “nano-systems” are possible. Further, the serial backplane is
readily extended through cabling, allowing the concatenation of cubes
into chains or even wrapping a system around a pole in a non-standard
form-factor, without requiring alteration of either the modules or the
logical structure.

Choosing among the ATCA, MicroTCA, nano-system or a custom form-factor
is mostly a matter of trading off between size and the benefits that
come with greater physical capacity. Using AMC cards is relatively
straightforward. A system manager is required to run the
communications protocols and switching functions of the backplane, run
the IPMI interface and handle any high-availability software the
system requires. Since AMC modules are not system-aware, external
control is needed to handle system errors, failover and hot-swap
functions. Beyond that, system development is more a matter of
collecting the right set of modules.


Only a few AMC modules in combination are required to provide the
basis of many systems (Table 2). A media server, for instance, might
need only a CPU module for configuration and control, a DSP module for
decoding the media streams and a hard drive module for holding the
media. A signaling gateway might contain a CPU along with an
intelligent I/O module. Add a DSP for encoding and decoding, and the
design becomes a media gateway.
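
The module combinations in Table 2 can be expressed as sets, which makes the "add a DSP" step explicit. This is a minimal sketch; the constant and function names are assumptions introduced for the example, and the system contents are taken from Table 2.

```python
CPU = "General-purpose CPU"
DSP = "Digital Signal Processor"
HDD = "Hard Disk Drives"
IO = "Intelligent I/O Cluster"

# Module sets as listed in Table 2.
SYSTEMS = {
    "Media Server": {CPU, DSP, HDD},
    "Signaling Gateway": {CPU, IO},
    "Media Gateway": {CPU, DSP, IO},
}


def buildable(stock):
    """Names of the systems that can be assembled from the modules in stock."""
    return sorted(name for name, needed in SYSTEMS.items() if needed <= stock)


def stock_list():
    """Distinct module types needed to build every system in the table."""
    return sorted(set().union(*SYSTEMS.values()))


# Adding a DSP to a signaling gateway yields the media gateway module set.
assert SYSTEMS["Signaling Gateway"] | {DSP} == SYSTEMS["Media Gateway"]
```

Four module types cover all three systems in the table, which is the inventory argument the article makes for standardized modules.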

Although born out of a desire to reduce costs in ATCA systems, the AMC
standard is proving to have tremendous versatility. The modules can
support high-performance functions, have the essential elements of
hot-swap capability and can operate as stand-alone cards, as well as a
component of a larger card. With the right framework, AMC modules can
form the basis of systems covering a full range of telecommunications
applications.

Artesyn Communication Products
Madison, WI.

AMC as System Solution


Converging Technologies
Deliver Next-Generation
Embedded Solutions

Originally conceived as a companion technology for ATCA, the
AdvancedMC form-factor is proving to be a versatile building block
for systems architectures that combine a variety of new and legacy
technologies.

by Robert Persons and William Coffey 
Motorola 


Many industries that have traditionally relied on VME are driving to
transform and modernize their systems to take advantage of new
processor architectures and fabric-based system architectures as
quickly as possible. This poses a challenge because system integrators
and OEMs would like to make incremental changes to existing systems
and not have to redesign all elements. They would like to reuse I/O
that has traditionally been supported on VME, while updating the
compute control core to new technologies and have them work together.
Rather than selecting a single standard, the optimal solution will be
provided through support by both VME and MicroTCA platforms for
common, published, open software standards in terms of operating
systems and middleware. It is this convergence of existing and new
technologies that will be critical to the modernization of high-end
systems.

Figure 1

MicroTCA-based communications servers can leverage the same
POSIX-compliant operating systems and management middleware from ATCA.
The Netplane suite of software from Motorola provides Service
Availability Forum (SAF)-compliant high-availability services
independent of the underlying hardware or target application.

MicroTCA leverages the emerging ecosystem of the Advanced Mezzanine
Card (AdvancedMC) to create a new, flexible, small form-factor
platform. It is important that system manufacturers and integrators
understand the capabilities that MicroTCA brings to the embedded
market.

AdvancedMCs were originally developed for AdvancedTCA (ATCA)
platforms. ATCA has been targeted at the “aggregation” layer of the
great “telecommunications network onion,” in between the outer edge
“access” devices and inner “core” switching functions. Here, the
priority is placed on a high compute capacity per blade, high
availability, high bandwidth fabric between the compute blades and a
few high-speed, high-density network I/O interfaces. And, given the
investment in ATCA chassis and blades, it is important to have an
ecosystem where many different solutions can be built around the same
fabric (and thus the same or compatible blades).

AdvancedMCs bring further economies of scale to ATCA in the form of a
more granular approach to the addition of either specialized
processing elements or I/O interfaces. Adding from one to eight
AdvancedMCs to an existing ATCA processing, I/O or dedicated carrier
blade incrementally extends the solutions possible from a single ATCA
chassis, both horizontally toward the network edge and vertically
within the aggregation layer. This leverages the investment in ATCA
into wider telco applications.

However, there are still environments where ATCA is simply either too
large or too expensive to serve as a base
architecture, or where -48 VDC is not an option. Thus, the idea of
MicroTCA as an architecture was born (around the same time AdvancedMCs
were first conceived), where the same pay-as-you-grow scalability
could be realized in a much smaller, self-contained carrier assembly,
and where, to a certain extent, the chassis form-factors (and
applications) are limitless. Here, a typically passive backplane
carrier can be mated with the appropriate type and number of fabric
“hub” modules to provide the bandwidth, protocol and redundancy
required for the application.

Six AdvancedMC form-factors of varying component height and module
width have been specified, all leveraging the same high-speed, 170-pin
edge connector. The MicroTCA specification allows for modular or
monolithic chassis configurations from one carrier and one module to
16 carriers and 192 modules, while ensuring that modules always see
the same “virtual” environment. These MicroTCA communications servers
typically support two to three independent fabric interconnects on a
carrier, where each fabric “port” (differential transmit/receive pair)
is capable of up to 6.25 Gbits/s in each direction, and specific ports
can be aggregated to form “fat pipes” with higher throughput.

PICMG has defined AdvancedMC fabric interconnect standards based on
Gigabit Ethernet/10 Gigabit Ethernet (GigE/10GigE; AMC.2), PCI Express
(AMC.1) and Serial RapidIO (AMC.4), along with defining storage
interconnections based on SATA/SAS or Fibre Channel (AMC.3). The
MicroTCA specification leverages these standards, allowing for
switched and/or point-to-point fabrics. Current MicroTCA fabrics
typically range from 1 Gbit/s (one port) to 12.5 Gbits/s (4 ports).
The current aggregate carrier (switched backplane) bandwidth is around
40 Gbits/s, but next-generation hubs should exceed this; and the
MicroTCA specification allows for up to 12.5 Gbits/s per port.
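
The scaling figures quoted above are simple multiples, and a short sketch makes the arithmetic explicit. Two values here are inferred rather than stated in the text: a per-port rate of 3.125 Gbits/s (from "12.5 Gbits/s (4 ports)") and 12 modules per carrier (from "16 carriers and 192 modules"). The function and constant names are illustrative.

```python
def fat_pipe_gbps(ports, per_port_gbps):
    """Throughput of a 'fat pipe' formed by aggregating identical ports."""
    return ports * per_port_gbps


# Inferred from the text: 12.5 Gbits/s over 4 ports.
CURRENT_PER_PORT_GBPS = 12.5 / 4

# Inferred from the text: 192 modules across 16 carriers.
MODULES_PER_CARRIER = 192 // 16

print(fat_pipe_gbps(4, CURRENT_PER_PORT_GBPS))  # four-port pipe today
print(fat_pipe_gbps(4, 6.25))                   # four ports at 6.25 Gbits/s
print(MODULES_PER_CARRIER)
```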

At present, AdvancedMC modules are available or under development for
general-purpose processing (x86 & PPC), Digital Signal Processing
(DSP), Digital Signaling (E1/T1/J1, OCx, DS3, etc.), Serial ATA (SATA)
and Serial Attached SCSI (SAS) storage, GigE/10GigE, Wireless
Broadband (WiMAX), Voice-over-IP (VoIP), VGA video, and even PCI/PCI
Telecom Mezzanine Card (PMC/PTMC) carriage. While the majority are
focused specifically on telecommunications, more than a few provide
applications in other solution spaces, such as military, aerospace,
industrial and medical.

AdvancedMCs were designed to integrate into the highly available
environment of ATCA, and this support carries over to the MicroTCA
environment. Offering hot swap, Intelligent Platform Management
Interface (IPMI), dynamic fabric negotiation, power budgeting and
more, MicroTCA covers the availability, serviceability and
manageability requirements of many target markets. In addition, unlike
the ATCA blade fabric, AdvancedMC fabrics can differ between MicroTCA
(and ATCA) carriers (within the same or different shelves), allowing
the most appropriate choice for the application. As such, the lower
overall cost of these modules will allow a less fabric-sensitive
ecosystem to develop.

VME/MicroTCA Convergence 

At first blush, MicroTCA appears to compete with VME, especially with
next-generation VME + fabric solutions, such as VXS (VITA 41) and VPX
(VITA 46). While there is clearly a choice that must be made when
considering general-purpose computing platforms, VME continues to be
the logical choice in many military, industrial control and medical
imaging applications. Because VME has a large ecosystem of COTS and
custom I/O targeting military applications, it will continue to be a
critical architecture for many years to come.

The purpose-built backward compatibility of each successive revision
to the VMEbus standard allows many VME edge systems to be refreshed
with single board computers that support POSIX operating systems, open
middleware, 2eSST and multiple GigE interfaces. These SBCs can
integrate into environments with ease, yet still continue to interface
with legacy I/O devices. In many cases, they can also communicate
internally at fabric speeds via 2eSST, all in an existing chassis.

However, some of the traditional roles VME has played, such as
concentrated digital signal processing systems and compute centers,
can migrate to alternate technologies such as network-centric MicroTCA
communications servers. A heterogeneous blend of VME-based I/O
subsystems supporting traditional VMEbus I/O will exist, primarily at
the edge of the system, interfacing to weapons or sensors.
Concurrently, compute-centric functions, along with some digital
signal processing systems, can take advantage of the compute density
of MicroTCA.

How Will it Work? 

Software standardization efforts will allow the development of
heterogeneous environments where a common software architecture is
used throughout the system. VME and MicroTCA-based systems will run
common operating systems and middleware optimized for each
architecture. Applications running on VME processor blades will share
data through the middleware, using plug-in modules that optimize the
transport for 2eSST, while MicroTCA systems will have transport
plug-ins optimized for GigE or PCI Express. Applications running on
different systems will continue to share data over GigE. Abstracting
to a middleware product, like DDS, eliminates the need for application
software to be written for one particular system architecture.
Contractors can evaluate new technologies, such as MicroTCA, as
candidates for upgrading hardware, while architects designing new
systems can pick and choose the most appropriate architecture for each
part of the system (Figure 1).
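
The transport plug-in idea described above can be sketched as an abstract interface with one implementation per link type. This is an illustration only: the class and method names are hypothetical and do not represent the API of DDS or of any middleware product, and the plug-ins here merely report which link they would use.

```python
from abc import ABC, abstractmethod


class Transport(ABC):
    """Hypothetical transport plug-in interface (not a real DDS API)."""

    @abstractmethod
    def send(self, topic: str, payload: bytes) -> str:
        """Deliver payload and return a short description of the action."""


class NamedTransport(Transport):
    """Stand-in plug-in that only reports which link it would use."""

    def __init__(self, name: str):
        self.name = name

    def send(self, topic: str, payload: bytes) -> str:
        return f"{topic} ({len(payload)} bytes) via {self.name}"


class Middleware:
    """Routes data over the local plug-in or the inter-system link."""

    def __init__(self, system: str, local: Transport, inter_system: Transport):
        self.system = system
        self.local = local
        self.inter_system = inter_system

    def publish(self, topic: str, payload: bytes, dest_system: str) -> str:
        # The application never names a transport; the middleware picks one.
        if dest_system == self.system:
            return self.local.send(topic, payload)
        return self.inter_system.send(topic, payload)


vme = Middleware("vme-chassis", NamedTransport("2eSST"), NamedTransport("GigE"))
utca = Middleware("utca-shelf", NamedTransport("PCI Express"), NamedTransport("GigE"))
```

The application code calls `publish` identically on both platforms, which is the portability benefit the article attributes to the middleware layer.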

New technologies such as WiMAX (IEEE 802.16) can also be deployed
using MicroTCA communications servers. WiMAX is a wireless broadband
technology that is being considered for systems that will help control
the cost of wireless high-speed transport. Platform variants defined
in the MicroTCA standard, like cubes and Pico shelves, are well suited
to support network nodes based on WiMAX. At the same time, rugged
variants of MicroTCA communications servers will move MicroTCA into
the field and into vehicles.

In order to take this one step further and to determine how AdvancedMC
blades and MicroTCA chassis components can be ruggedized for a variety
of environments, a Special Interest Group (SIG) has been formed by a
collection of companies interested in MicroTCA. The goal of the SIG is
to allow the use of COTS AdvancedMC modules in both commercial and
military/industrial applications. In addition, the “ruggedization” SIG
is researching how new packaging techniques can be applied to
commercial AdvancedMCs to harden them sufficiently for harsh
environments and whether conduction-cooled systems can be designed
around them. If COTS AdvancedMCs can be used, the overall system cost
will be lower and the variety of blades available to an application
will dramatically increase. The work of the SIG will be part of an
official standards body working group, so the completion of a
ruggedization specification should coincide with the release of the
MicroTCA standard.

VME64x still offers a great deal of variety, but at reduced backplane
bandwidth. VITA 41 will increase the backplane bandwidth with the
addition of fabric interfaces, but this will require a system
retrofit. MicroTCA has a much smaller footprint and the ability to
scale better than a VXS-based system. In addition, the AdvancedMC
ecosystem has already begun to grow prior to the release of the
MicroTCA system standard.

The use of AdvancedMCs in a variety of applications both on ATCA
carrier blades and within MicroTCA communications servers will help to
substantially drive down their cost. Moving forward, future systems
will be able to leverage AdvancedMC flexibility, without the added
expense and infrastructure requirements of ATCA. Furthermore,
applications that can benefit from the increased power envelope of
ATCA can also benefit from the variety of AdvancedMCs that will be
available for both environments.

1U Server Solutions


Non-Traditional Blade 
Server Applications 


As blade servers pack increasingly more punch into smaller form-factors, 
their use is spreading beyond data center applications. The integration 
of newer technologies makes blades attractive for high-performance 
computing purposes such as distributed computing, rendering/imaging 
and data analysis. 


by Laura P. Cooper 
NextCom 

Blade servers have gained ground over the past few years as an
efficient, condensed computing solution in large data center
implementations. They are widely used as replacements for pizza box
servers and large rackmounts (Figure 1).

The advantages inherent in blade server technology include smaller
form-factors, denser computing, expandability, hot-swap capabilities,
flexible and fail-safe architecture, reduced downtime, increased
redundancy, simplified server management, easier hardware and software
integration, and lower heat dissipation and power requirements. These
improvements over traditional data center servers provide for a
massive increase in deployable resource density and an overall
reduction in long-term costs.

Each blade in a chassis is typically a self-contained server. Data
center consolidation, advanced communications and remote management of
servers required to run 24/7 are just a few of the reasons for
large-scale deployments. These deployments are becoming more popular
for telecom, telephone and cellular carriers, insurance companies, tax
preparers, state and local government agencies and educational
institutions, among others.

Although blade servers have a long 
way to go before they are the standard 
deployed technology throughout the data 
center, they are beginning to appear in 
less traditional implementations such as 
high-performance computing (HPC). 

Newer Blade Technology 
Enables High-Performance 
Functionality 

Many of the characteristics that enable the use of blades in HPC
applications have emerged in recently introduced blade servers. As
they become increasingly smaller and more powerful, blades are
incorporating technology such as open standard architectures,
multicore processors, PCI expansion for multiple I/O functionality,
the ability to house multiple OSs, low-power processors and innovative
cooling techniques, standard AC electrical connectivity,
daisy-chaining and Gigabit Ethernet ports. When these are combined
with the growing trend of virtualization, blade functionality
increases even more.

Figure 1

Blade servers are gaining ground as an efficient, condensed computing
solution in large data center implementations.

Figure 2

AMD64 Technology with Direct Connect Architecture (separate memory and
I/O paths; processor die eliminate most bus connection) ... can
improve the use of system resources and high-end components.
(Courtesy of Advanced Micro Devices)

The great advantage of an open architecture is that anyone can design
add-on products for it. An open architecture also increases the
potential for partnerships that enable the integration of design cycle
management tools. By integrating the tools engineers already use
within the same user interface, open design environments enable
greater productivity. Innovative products, open standards and
interoperability are fundamental to the mainstream adoption of new
services in the HPC world.

The emergence of multicore processors gives blade manufacturers even
denser computing options. Chips such as Advanced Micro Devices’
dual-core Opteron processor offer significantly greater computing
performance than equivalent single-core devices, yet produce no
additional heat and do not require additional power. The consolidated
processing strength of multiple cores enables next-generation systems
and opens up new possibilities for very high-performance blade
servers.

The reduction of bottlenecks among processors and system components in
AMD’s Direct Connect architecture, for example, makes possible more
efficient use of current system resources as well as tomorrow’s
high-end components (Figure 2). In particular, dual-core processing is
well suited to large clusters. The processing capability of two cores
on one chip makes possible server and workstation consolidation. The
efficiency of this technology also opens up a new level of
high-performance computing and the flexibility to design more
innovative solutions that directly address specific customer needs.

The quest for greater performance in embedded and high-performance
applications has pushed systems manufacturers to incorporate PCI
expansion slots into newer blade servers. Higher-performing PCI-X
clock speeds up to 133 MHz pump throughput up into the gigabyte/second
range. This speedy bus is ideal for critical cards such as frame
grabbers and higher-end control cards. PCI-compatible graphics cards
can now be added directly to the server for use in backracking,
machine vision and imaging implementations.
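
The "gigabyte/second range" figure follows from the bus width and clock rate. The sketch below computes theoretical peak throughput for a parallel bus that transfers one word per clock; it assumes a 64-bit PCI-X slot at the 133 MHz quoted above and ignores protocol overhead, so sustained rates will be lower.

```python
def peak_mbytes_per_s(bus_width_bits, clock_mhz):
    """Theoretical peak of a parallel bus moving one word per clock cycle."""
    return bus_width_bits / 8 * clock_mhz


print(peak_mbytes_per_s(64, 133))  # 64-bit PCI-X at 133 MHz
print(peak_mbytes_per_s(32, 133))  # 32-bit mode at the same clock
print(peak_mbytes_per_s(32, 33))   # conventional 32-bit, 33 MHz PCI
```

The 64-bit case works out to 1064 Mbytes/s, just over one gigabyte per second, which matches the claim in the text.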

PCI-X is viewed by many as a logical way to combine more robust
bandwidth and greater cost efficiency in data-intensive environments.
Applications range from medical imaging to industrial controls to
virtual private networking, as well as communications systems and
storage area network products such as clustered servers. A PCI-X
solution usually offers 32-bit or 64-bit modes and is backward
compatible with existing PCI configurations.

The ability to house multiple OSs, previously a convenience, is
quickly becoming a requirement in high-performance applications.
Within a single rack the newer blades can each run a different OS,
either in single or dual boot mode, or, even more efficiently, two
concurrent virtualized OSs. Ideally, a blade would provide
interoperability with all major OS offerings, including Windows Server
and XP Pro, RedHat, SUSE, Fedora Core Linux, and, in some cases, even
Solaris 10. Off-the-shelf software applications could be integrated
seamlessly and most custom applications could be installed with little
or no reconfiguration.

Advantages of Virtualizing in Server Applications:

• Multiple server functions via one machine
• Applications virtualized on one machine can communicate efficiently
• Concurrently run workstation and server applications on one computer
• Utilize more overall processing power, memory and storage
• Reduce costs by minimizing physical computer needs
• Move legacy software to newer, more efficient hardware platforms
• Virtual fail-safe structure; if one virtual server fails, another
  takes over

Advantages of Virtualizing in Testing and Development Environments:

• Enables consolidation of multiple test and development applications
• Faster time-to-market and increased profitability due to shorter
  test cycles
• Optimize performance while reducing overhead and IT management costs
• Test 32- and 64-bit applications on secure, modern hardware
• Seamlessly manage and test legacy applications across heterogeneous
  environments

Figure 3

Virtualization partitions a server into several virtual machines, each
able to run its own OS and application environment.

Figure 4

NextCom’s reconfigurable NextServer 416 incorporates the latest
technology in a 4U footprint to provide a solution for blade server
consolidation, cluster and high-performance computing and reliable
data storage.


As blades become smaller and more efficient, low-power processors and
innovative cooling techniques are emerging that minimize power and
cooling requirements while increasing maximum processor core density.
A blade can further reduce wattage with a combination of the proper
management of power requirements and the customization of hardware
configurations to meet the needs of the application and task. A mix of
active and passive cooling techniques allows form-factors to be
reduced and blades to be housed in much smaller rooms than required
previously.

Single-phase AC power sourcing is a major development in blade server
technology. It simplifies electrical efficiencies and reduces data
center infrastructure requirements by eliminating the need for
hard-wiring and expensive transformer circuitry. Older server
technology incorporates specialized three-phase, 208V input
connections. The expense of rewiring a server room to accommodate this
type of connectivity figures largely into the overall total cost of
ownership. Newer blades are often designed to run on a standard
110/220V single-phase wall socket by utilizing advanced load-balancing
and integrated hot-swappable power distribution buses that result in
built-in redundancy and substantially lower cost.

The ability to daisy-chain several blade servers frees up more space
and packs more computing horsepower into the same square footage by
stacking several chassis in a rack. Some blade manufacturers are now
incorporating built-in keyboard/video/mouse (KVM) switches and
built-in inter-chassis daisy-chain ports, which eliminate unnecessary
cabling and the need for separate external monitors. This can provide
an aggregate interconnect rate of up to 10 Gbytes/s. Connectivity is
also becoming more efficient to storage area networks via Fibre
Channel and by the use of Gigabit Ethernet for creating virtual LANs.

Virtualization and Blades: The 
Dense Computing Solution 

Virtualization, an old idea in the server community, has recently
re-emerged as a solution to the ever-growing problem of underutilized
physical servers. It partitions a server into several “virtual
machines,” each capable of running its own OS and application
environment. When blade server technology is combined with
virtualization, the result is a massive increase in functionality. For
example, VMware’s middleware solution allows a host OS to
simultaneously run a guest OS as a virtual layer, with a multitude of
OS configuration options. The optimal virtualization implementation is
applied to the highest-performing processors. When high-performance
capabilities are applied to the smallest available blade servers,
virtualization is at its most efficient, enabling space minimization,
system efficiency and better application utilization (Figure 3).
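
The partitioning idea can be illustrated with a small resource model: a blade has a fixed pool of cores and memory, and each virtual machine claims a slice of it. This is a conceptual sketch only; the class names and fields are assumptions for the example and do not correspond to VMware's or any other vendor's interface.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualMachine:
    name: str
    guest_os: str
    cores: int
    memory_gb: int


@dataclass
class Blade:
    cores: int
    memory_gb: int
    vms: list = field(default_factory=list)

    def free_cores(self):
        return self.cores - sum(vm.cores for vm in self.vms)

    def free_memory_gb(self):
        return self.memory_gb - sum(vm.memory_gb for vm in self.vms)

    def add_vm(self, vm):
        """Place a VM on the blade, refusing to over-commit resources."""
        if vm.cores > self.free_cores() or vm.memory_gb > self.free_memory_gb():
            raise ValueError(f"not enough free resources for {vm.name}")
        self.vms.append(vm)

    def core_utilization(self):
        """Fraction of the blade's cores assigned to virtual machines."""
        return 1 - self.free_cores() / self.cores


# Hypothetical four-core blade hosting two guests with different OSs.
blade = Blade(cores=4, memory_gb=16)
blade.add_vm(VirtualMachine("web", "Linux", cores=2, memory_gb=8))
blade.add_vm(VirtualMachine("test", "Windows", cores=1, memory_gb=4))
```

In this example three of four cores are assigned, so the blade that might otherwise sit mostly idle under a single workload is 75 percent allocated.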

HPC Blade Implementations 

The decision to employ blade servers for non-traditional applications
is based on the same benefits that make them attractive for the data
center. Distributed computing, rendering/imaging, number-crunching
processes, test and measurement data analysis, content manipulation,
server appliances or gateways, and heterogeneous computing using mixed
OSs are just a few of the areas expected to utilize blades for HPC.

Computing appliance, or dedicated hardware, applications are demanding
a move from general-purpose computing to a model that is both powerful
and flexible, which combines the strengths of the multi-purpose model
with the appliance concept. Simple, reliable devices are required that
facilitate repeated tasks. Used as appliances, newer blades provide
both the user interface and the “box.” By utilizing open standard
architecture and off-the-shelf OSs developed for appliance
applications, blades can be fine-tuned to give the optimal performance
of a specific desired service, while non-required services are
disabled.

As the technology of instrumentation used for signal capture and
analysis becomes more complex, so does the need for computers
providing immense processing capabilities and flexibility.
High-quality measurement and analysis of incoming data is crucial, and
high-resolution displays are required for visually rendering much of
that data. Aerial and satellite image analysis, automated mapping,
detection of human activity, change detection and perceptual
organization are some of the processes that require intensive
computational processing.

Although signal analysis software that performs these functions often
runs on standard OSs such as Windows XP Pro, these complex
calculations require more intensive processing than a standard PC can
provide. Blades can process large amounts of data, as well as allow
multiple monitor hookups. The same requirements, including
imaging/rendering needs, apply to test and measurement data analysis.

The integration blade technology brings is poised to define a new
generation of distributed computing. Backracking, or centralization,
is increasingly being deployed as the architecture of choice for both
small and large desktop applications. Its benefits include the
elimination of costs for moves, adds and changes, which increases
system uptime; the centralization of support; and built-in security
for both data protection and disaster recovery. In a backracked
environment, a blade acts as a central point that houses the OS and
applications, distributing them as requested. In this configuration,
100 or more computers can be stored in a single, centralized rack.
With blades in use as desktop PCs, computing power is ensured while
maintenance is consolidated and platforms and systems can be quickly
upgraded as new technologies become available.

Industry Adoption 

Key vertical industries, including oil and gas, are beginning to utilize blades for these non-traditional applications. The newest blade servers are well suited to seismic data analysis, data manipulation, visual rendering via FireWire interfaces and data storage. Military applications such as signal detection and analysis, surveillance, data analysis and manipulation, and visual rendering of data are beginning to utilize blades as well. Other industries include test and measurement and life sciences, both of which require intensive, number-crunching processes, as well as analysis, manipulation and rendering of data. A less obvious implementation is in the field of media, where digital video capture, digital content creation, editing and visualization require increasingly greater processing capabilities in a smaller space.

The Emergence of Fully 
Capable Blade Servers 

Many vendors are recognizing the need to address these less common deployments of blade server technologies by adopting some, if not all, of the requirements needed to perform these tasks. For example, NextCom’s NextServer 416 (Figure 4) is a high-availability, extreme performance platform that uses the latest technology to provide a solution for server consolidation, cluster and HPC, along with reliable data storage in a small footprint, 4U reconfigurable server.

The platform’s open standard architecture supports AMD’s dual-core Opteron processor and 64-bit Intel EM64T Xeon clustered or independent blade computing. Its performance and form-factor fit the needs of embedded, distributed computing, high-performance imaging and interconnect applications. The aggregate data rate is up to 10 Gbits/s and the platform supports daisy-chained KVM, remote management and alarming, and blade hot-swappability.

Additional blade options include Fibre Channel, Gigabit Ethernet, hard disk, flash disk, PCI-X I/O expansion and SCSI. External monitors can be hooked up to each blade for remote monitoring and visualization. Multiple networks for different functions and/or redundancy can run simultaneously. On-blade storage and external network-attached storage provide flexibility for partitioning applications and data. Blades can be packaged with 32-bit and 64-bit workstation and enterprise server versions of Windows, Linux and Solaris.

High-performance, open standard computing is becoming more commonplace across an increasing number of technology-rich industries. The recent implementation of this technology in blade servers allows non-traditional applications to maximize efficiency and performance. The use of blade servers for non-traditional, high-performance applications will likely increase, just as blade servers will continue to incorporate even more functionality. Smaller form-factors, earlier adoption of new developments in multicore processor technology, increased flexibility and expandability, and increased power efficiency are likely to emerge as technology and application needs evolve.

10 Gigabit Ethernet and Beyond


10 Gigabit Ethernet 
Backplanes Make ATCA 
Chassis COTS 


Although Ethernet’s use in the backplane has been significant, its high latency has limited potential applications. Low-latency 10 Gigabit Ethernet switch chips and network interface chips are changing that picture.


by Bud Noren 

Fulcrum Microsystems 


A key element in the future of 10 Gigabit Ethernet is the drive to lower its latency and make it the de facto interconnect technology for COTS systems. As engineers begin to rethink chassis design with the aim of transitioning their systems to the AdvancedTCA form-factor, they are increasingly considering 10 Gigabit Ethernet as the ultimate unifier of the data center or central office.

With its effort to develop a standardized chassis form-factor, ATCA represents one of the most significant thrusts into COTS systems currently under way. The promise of ATCA is that a single chassis will be able to house boards from a variety of manufacturers and allow system designers to buy the right functionality for their applications without being held hostage to a vendor’s closed system.

AdvancedTCA balances this openness with a chassis design that delivers the performance needed for next-generation telecommunications systems. A key part of that is ensuring that the system backplane walks the line between high performance and off-the-shelf flexibility. In this environment, Ethernet is a natural choice because it is already the dominant technology used for the data flowing through the network: about 85% of all networks are Ethernet-based. That broad adoption also means that the supporting software and human knowledge are widely available to implement systems.

ATCA Backplane Zones

The ATCA backplane is divided into 3 zones for data transport, power and management.
Zone 3 is for front board to RTM interconnection (connector undefined) and the rear I/O access area.
Zone 2 is for base interface and fabric.
Zone 1 is for power and system management.

Figure 1

In transitioning their systems to the ATCA form-factor, engineers are increasingly considering 10 Gigabit Ethernet for use in the backplane. Zone 1 of the ATCA backplane is used for power distribution and system management, zone 2 for data transfer and zone 3 for access to additional I/O.

This widespread usage of the technology, in addition to its increasing performance, has driven interest among systems builders in using Ethernet in the backplane, despite a perception that its end-to-end latency and congestion management are inferior to those of alternative technologies. Work to adopt Ethernet for use in the backplane has yielded solutions that possess the features and performance to compete with alternative interconnect technologies today, and also carry the promise of even better performance in the future.

Backplane Requirements of an 
ATCA Chassis 

The ATCA architecture is based on PICMG 3.0. Among other configurations, it calls for a 12U chassis with blades that are 8U high by 280 mm deep with a 1.2-inch pitch. Total board power can be up to 200W with forced-air cooling in the bottom of the chassis. Power comes from a backplane standard that allows for distribution of 48 VDC power to all of the cards.

Multiple interconnect technologies are supported in the ATCA specification, including Ethernet, PCI Express and InfiniBand in dual-star or mesh interconnection. Multiple links of up to 40 Gbits/s (for a total capacity as high as 240 Terabits/s) are supported, with 99.999% uptime and central office-level quality-of-service.

The backplane is divided into three zones (Figure 1). Zone 1 is for power distribution and system management, zone 2 is for data transfer and zone 3 is for access to an optional Rear Transition Module (RTM), typically used for additional I/O. Zone 2 is further broken down into two parts. A base interface that is specified as switched 10/100/1000 Ethernet is used for system management and, optionally, for data path traffic. An optional, higher-speed fabric interface supports a higher-throughput data path. The fabric interface is flexible and can be configured either as a full mesh or as a dual-star or dual-dual star, depending upon the application.

The switching fabric is protocol-agnostic, thereby giving system designers a choice by defining several sub-specifications for specific interfaces and protocols. These include PICMG 3.1 for Ethernet, which is also used for the base interface, PICMG 3.2 for InfiniBand, PICMG 3.3 for StarFabric and PICMG 3.4 for PCI Express.

Figure 2

Dual-star topology makes use of redundant hub cards added to the chassis that provide the backplane switching fabric. Dual-star configurations are designed for applications with predictable traffic flows, as well as for non-local traffic, such as multiplexing applications.

The dual-star topology uses redundant hub cards that are added to the chassis to provide the backplane switching fabric. Each node card has a channel connection to the hub cards across the backplane and the hub card is responsible for switching all data. A backplane channel connects both hub cards to each other for redundancy. Similarly, a dual-dual star configuration, in which hub cards are used to create two entirely separate and redundant fabrics with fabric interfaces to each, is also supported. This is useful in applications where extra throughput is needed.

ATCA Chassis Configurations 

The dual-star configurations are designed for applications with predictable traffic flows (Figure 2). More bursty traffic can result in higher latency as one node card consumes a disproportionate share of the switch resources. Dual-star backplanes are also a good design for non-local traffic, such as multiplexing applications. A data packet that is switched to multiple node cards within the switch is subject to extra latency associated with the transit back and forth to the hub card.

A switch for this application should have a good flow control mechanism to help regulate the flows from all of the node cards to alleviate congestion at the hub board. Additionally, a low-latency switch provides the broadest applicability. Increasingly, applications are becoming latency sensitive, especially as ATCA chassis are being considered for storage and computing systems. Those applications will be severely impacted by a switch element that is not optimized for latency.

In contrast to the dual-star approach, a full mesh topology is usually used in applications that have large data throughput needs, or that require high levels of peer-to-peer processing from elements within the chassis. In a full mesh configuration, a backplane switch resides on each node card and connects into a matrix of channels on the backplane. Those channels extend to all of the other node cards in the chassis.

Figure 3

A full mesh topology works best for applications with high data throughput, or those requiring high levels of peer-to-peer processing from elements within the chassis.

The mesh design (Figure 3) offers higher scalability, system redundancy, and transit efficiency. However, this comes at the cost of a larger number of trace routes on the backplane, which can drive up the cost of this component. Additionally, the mesh configuration typically requires software enhancements to the applications to make them aware of the any-to-any connectivity provided by mesh topology. The high cost of the backplane, together with the multiple backplane switch chips on each node card, makes the full mesh appropriate for smaller systems, or systems that truly require single-hop latency between node cards.
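
The wiring trade-off between the two topologies can be made concrete with a little counting. The following is only a sketch: it assumes one backplane channel per connected pair of cards, and the 14-node-card chassis is a hypothetical size chosen for illustration, not a figure taken from the ATCA specification.

```python
def full_mesh_channels(node_cards: int) -> int:
    """Channels needed so that every node card links directly to every other."""
    return node_cards * (node_cards - 1) // 2


def dual_star_channels(node_cards: int, hub_cards: int = 2) -> int:
    """Channels from every node card to every hub card, plus hub-to-hub links."""
    node_to_hub = node_cards * hub_cards
    hub_to_hub = hub_cards * (hub_cards - 1) // 2
    return node_to_hub + hub_to_hub


if __name__ == "__main__":
    nodes = 14  # hypothetical chassis size, for illustration only
    print("full mesh:", full_mesh_channels(nodes), "channels, single hop")
    print("dual star:", dual_star_channels(nodes), "channels, traffic transits a hub")
```

The mesh count grows with the square of the number of cards while the dual-star count grows linearly, which is consistent with the full mesh being attractive mainly for smaller systems.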

Optimizing 10 Gigabit Ethernet 
for the Backplane 

For 10 Gigabit Ethernet to be used in an ATCA backplane, the signals must be optimized for an electrical trace rather than for a copper or fiber cable. The key 10 Gigabit physical layer technology used on the backplane is 10 Gigabit Attachment Unit Interface (XAUI). This has become the de facto standard for 10 Gigabit Ethernet chip-to-chip connections, and is popular in backplane applications because it is easy to implement.

Figure 4

The FocalPoint switch chip from Fulcrum Microsystems is one of several low-latency switches aimed at ATCA applications. It features twenty-four 10 Gigabit Ethernet ports and a latency of 200 ns.


XAUI is a parallel interface composed of four 3.125-Gbaud serial lanes, with a low pin count for fewer traces across a circuit board. Since it is derived directly from 10 Gigabit Ethernet, it keeps many of that technology’s important features, such as 8B/10B encoding, making the packet transition to XAUI fast.
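
The lane rate and the 8B/10B encoding together explain why four lanes yield exactly 10 Gbits/s of payload. The short calculation below is a sketch of that arithmetic; the function and constant names are illustrative and do not come from any standard or library.

```python
XAUI_LANES = 4
LANE_BAUD = 3_125_000_000  # symbols per second signalled on each serial lane


def payload_bits_per_second(lanes: int, baud: int,
                            data_bits: int = 8, code_bits: int = 10) -> int:
    """Usable data rate once line-code overhead is removed.

    With 8B/10B encoding, every 10 bits on the wire carry 8 bits of data.
    """
    return lanes * baud * data_bits // code_bits


if __name__ == "__main__":
    rate = payload_bits_per_second(XAUI_LANES, LANE_BAUD)
    print(f"XAUI payload rate: {rate / 1e9:.1f} Gbits/s")
```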

The XAUI interface is easy to design with because it is self-clocked, allowing designers to distribute the clock on their boards, and also delivers an inherently low level of electromagnetic interference. The interface can compensate for multi-bit bus skew, which contributes to its relatively long-distance specification of 0.5 meters.


So even though XAUI was not made for backplane applications, it has become popular there. The IEEE is now undertaking the creation of a backplane-specific physical layer specification for 10 Gigabit Ethernet that would result in a physical layer standard optimized for backplane applications.




May 2006

















Latency Comparisons

Figure 5

Lower latencies in 10 Gigabit Ethernet switch chips are now on par with other backplane choices for ATCA designers, such as Fibre Channel, Myrinet, InfiniBand and PCI Express.


802.3ap

The IEEE P802.3ap Task Force has been working on backplane Ethernet since May 2004, with the goal of defining Ethernet operation over electrical backplanes covering distances of up to 1 meter. This effort will combine the IEEE 802.3 media access controller (MAC) operating at Gigabit and 10 Gigabit speeds with three new physical-layer signaling standards: 1000BASE-KX, 10GBASE-KX4 and 10GBASE-KR. The KX4 standard is based on XAUI.

The 10 Gigabit Ethernet standard will be defined for parallel connections, such as XAUI, and for serial connections that will operate as a single channel at 10 Gbits/s. The goal of the specification is to incorporate as much of the Ethernet standard as possible, including the MAC frame format and important services encoding, such as Spanning Tree, virtual LANs (802.1Q) and management information (802.1f).

One of the key drivers of the push for 802.3ap is to better define the specifications for the serializer/deserializer (SerDes) transceiver circuits. Tightening up these specifications will provide tighter tolerances for the interface between the board and the backplane, and will incorporate new management mechanisms that make it easier to design and debug backplanes. The parallel standard could eventually be used for a 40 Gbit/s backplane, with each channel supporting 10 Gbits/s instead of today’s typical 2.5 Gbits/s.

The IEEE committee defining the standard will keep Ethernet packet sizes the same and will support the existing media-independent interfaces. The 802.3ap standard is now in draft form and should be completed in November of this year.

Congestion Management 

The IEEE is also tackling congestion 
management, another issue that should 
squarely benefit the move toward 10 
Gigabit Ethernet backplanes. The 802.3ar 
study group grew out of the recognition 
by the backplane Ethernet committee that 
moving Ethernet into these applications 
increases the sensitivity to frame delay, 
delay variation and packet loss. 

Ethernet’s current flow control capabilities were designed for environments where latency and delay variation are not key issues. The goal of the committee is to find a new congestion management scheme that can augment these capabilities. More specifically, the new standard will specify a mechanism to support the communication of congestion information and limit the data rate on an Ethernet link to back off traffic while congestion clears. All this is done without changing the MAC or physical layer interfaces and while minimizing the throughput reduction in non-congested flows.

Low-Latency Ethernet Provides 
a Solution Today 

Although the efforts of the 802.3ap and 802.3ar groups will improve Ethernet for ATCA, low-latency Ethernet solutions that utilize XAUI are available now for ATCA chassis designers. These can provide the performance necessary for the most demanding computing and communications applications.

Several vendors have debuted low-latency switch chips that help overcome flow control issues and make high-performance ATCA backplanes a reality, even for latency-sensitive applications like voice and video. Switch chips are now available with latencies as low as 200 nanoseconds (Figure 4). This is an order of magnitude improvement over the previous generation of Ethernet switch chips and on par with other choices for ATCA designers, such as Fibre Channel, Myrinet, InfiniBand and PCI Express (Figure 5).

These low latencies boost the efficiency of Ethernet’s existing flow control capabilities by providing a quick feedback loop to sources so that they back off their transmission rate until the switch is clear. This allows the switch to maintain wire-rate performance without giant buffers. Designers can therefore use standard, off-the-shelf technology to achieve their goals before the standards bodies finish their work.
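
The link between feedback delay and buffer size can be sketched with a simple estimate: the data that keeps arriving on a port before a back-off request takes effect is roughly the line rate multiplied by the feedback delay. The model below is deliberately simplified and makes its own assumptions: it counts only switch latency and ignores trace propagation and the source’s reaction time, and the 2,000 ns figure for an older switch is inferred solely from the “order of magnitude” comparison above.

```python
def bytes_in_flight(line_rate_bps: int, feedback_delay_ns: int) -> float:
    """Data arriving on one port before a back-off request can take effect."""
    bits = line_rate_bps * feedback_delay_ns / 1_000_000_000
    return bits / 8


if __name__ == "__main__":
    ten_gbe = 10_000_000_000  # bits per second
    # 200 ns is the latency quoted in the text; 2,000 ns is an assumed
    # previous-generation value, ten times higher.
    for label, delay_ns in (("low-latency switch", 200), ("older switch", 2000)):
        print(f"{label}: {bytes_in_flight(ten_gbe, delay_ns):.0f} bytes per port")
```

Under these assumptions, a tenfold reduction in latency means a tenfold reduction in the data each port must absorb while the sources slow down.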

Ethernet for ATCA is ready for prime time today. Because of its high performance, low latency, ubiquity, ecosystem and proven ability to transport any type of data efficiently, it can help turn ATCA systems into the epitome of a COTS device.


Achieving 10 Gigabit 
Ethernet and Beyond 


The transition to higher speed Ethernet, from 1 GbE to 10 GbE and 
beyond, is already occurring. Several technical concerns need to be 
addressed throughout the network in order to achieve these higher 
speeds. 

by Blaine Kohl 
Tehuti Networks 


Today, the data center is preparing for a volume transition from Gigabit Ethernet (GbE) to 10 Gigabit Ethernet (10 GbE). This transition results from the growing demand for the adoption of higher speed Ethernet.

The demand is coming simultaneously from several different directions. Traffic is increasing in terms of both the number and size of transactions, due in part to the Internet and increased personal connectivity. Data-intensive applications, such as grid and cluster computing, are growing. New services are being offered, such as On Demand Viewing, IPTV, broadband and video streaming. Bandwidth to residential and business clients is increasing. Finally, the existing protocols, such as link aggregation (LAG), are limited in their ability to handle this growth.

One of the key drivers for implementing 10 GbE is the limitations encountered when using the LAG protocol to aggregate multiple 1 GbE links. This protocol is frequently used as an interim technology to bridge the gap between Ethernet’s existing speed and its next higher speed. However, it is only a temporary solution, since more than four physical links become very difficult to manage and are costly to implement and maintain. For example, a standard network interface card only has enough room on its faceplate for four connections.

802.3 Layers

[Diagram: the higher layers of the OSI reference model above the 10 GbE MAC and PHY sublayers, with 10GBASE-R, 10GBASE-W, 10GBASE-X and 10GBASE-T PHYs each connecting to its medium]

AN = Auto-Negotiation
MAC = Media Access Control
MDI = Medium Dependent Interface
PCS = Physical Coding Sublayer
PMA = Physical Medium Attachment
PMD = Physical Medium Dependent
WIS = WAN Interface Sublayer
XGMII = 10 Gigabit Media Independent Interface

Figure 1

The same MAC and reconciliation sublayer are used for all 10 GbE devices, but the PHYs for connecting to the network transmission media are different.

Standard or Draft   Ratification Date   PHY type       Medium                 Reach (max)
Std. 802.3ae        2002                10GBASE-SR     MMF                    300m
                                        10GBASE-LR     SMF                    10km
                                        10GBASE-ER     SMF                    40km
                                        10GBASE-LX4    MMF/SMF                300m/10km
                                        10GBASE-SW     MMF                    300m
                                        10GBASE-LW     SMF                    10km
                                        10GBASE-EW     SMF                    40km
Std. 802.3ak        2004                10GBASE-CX4    Coax                   15m
P802.3an            June 2006*          10GBASE-T      Cat 6 UTP or better    100m
P802.3aq            Sept 2006*          10GBASE-LRM    MMF                    220m
P802.3ap            Mar 2007*           10GBASE-KX4    4-lane FR4 backplane   1m
                                        10GBASE-KR     1-lane FR4 backplane   1m

* Expected ratification date.

Figure 2

Since the original 10 GbE standard (IEEE 802.3ae) was ratified, amendments specifying a variety of new PHYs have been developed. Two new PHYs will be added in 2006, and two more in 2007.

Another limitation of the LAG protocol is the overhead incurred with new links. For every link added to the existing LAG group, the overhead on each link already in the group increases slightly. The advantages of 10 GbE are the resulting lower latency due to the elimination of the LAG protocol and its associated overhead, as well as higher bandwidth via a single 10 GbE link. In addition, 10 GbE offers a wide range of cable runs, from 15 meters on coax cables to 40 kilometers on single-mode fiber (SMF).

10 GbE Applications

Although 10 GbE products, such as switches and adapters, have existed since the standard for 10 GbE over fiber-optic transmission media (IEEE Std. 802.3ae) was ratified in 2002, certain impediments have prevented this technology from being widely adopted. With the ever-increasing need for higher bandwidth, these obstacles will be overcome within the next few years to permit 10 GbE to penetrate into volume platforms.

The higher speed and faster response time of 10 GbE can benefit any traffic-burdened network. This is particularly true for data-intensive applications. Examples include enterprise financial applications, database and data modeling simulations, weather forecasting, computer-aided design and manufacturing, and graphics-intensive applications such as those found in computer games and movies.

[Diagram: Before NTA / After NTA]

Figure 3

Latency increases as processors wait for memory to move data across the network. A network traffic accelerator (NTA) can free up CPU cycles by optimizing the system’s ability to process packets.


Previously, Gigabit Ethernet found its way into high-performance computing cluster applications. As costs and obstacles decline, 10 GbE will grow in this market. One of the more controversial possibilities is that, at 10 Gbit/s speeds, other technologies such as iSCSI will begin to make economic as well as technical sense, potentially replacing existing protocols such as Fibre Channel.

Metro area networks (MANs), as well as long distance local area network (LAN) applications, benefit greatly from the longer distances available via fiber optics. One example is the ability to place data centers where they are most advantageous in terms of cost and convenience, such as in offsite locations away from primary facilities. The longer reach of 10 GbE and its ability to map to existing SONET/SDH 10 Gbit/s protocols also make it well suited to wide area network (WAN) and Internet point-of-presence applications, where it can implement a network capable of transmitting data at terabyte-per-hour rates.
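
The “terabyte-per-hour” figure follows directly from the line rate. The sketch below works it out; it assumes a fully utilized link, ignores framing and protocol overhead, and uses decimal terabytes (10^12 bytes), so it gives an upper bound rather than a measured throughput.

```python
def terabytes_per_hour(line_rate_bps: int) -> float:
    """Upper bound on data moved in one hour over a fully utilized link."""
    bytes_per_hour = line_rate_bps * 3600 // 8
    return bytes_per_hour / 10**12


if __name__ == "__main__":
    for name, rate in (("1 GbE", 1_000_000_000), ("10 GbE", 10_000_000_000)):
        print(f"{name}: up to {terabytes_per_hour(rate)} TB per hour")
```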

What 10 GbE Is and Is Not 

In many ways 10 GbE is a straightforward extension of previous Ethernet speeds, providing full backward compatibility to previous Ethernet generations. It retains the key Ethernet architecture, the media access controller (MAC) protocol, the Ethernet frame format and the minimum and maximum frame size.

However, there are also some key differences, involving primarily the physical layer (PHY) of the Open Systems Interconnection (OSI) layer model (Figure 1). Although the same MAC and reconciliation sublayer are used for all 10 GbE devices, the PHYs for connecting to the network transmission media are different. In addition, unlike previous generations of Ethernet speeds, 10 GbE operates only in full-duplex mode. This is because by 1999, the use of switched Ethernet had grown to the point that support for the half-duplex mode of operation was considered a burden to the development of 10 GbE.

Figure 4

Tehuti Networks’ 10 GbE network traffic accelerator (NTA) chips, shown here mounted on a network interface card, reduce latency by optimizing the system’s ability to process packets.

The original 10 GbE standard (IEEE 802.3ae) specified a variety of PHYs for connection to optical multi-mode fiber (MMF) and SMF transmission media. Since its ratification, several new amendments have been developed or are currently in development. In 2004, IEEE Std. 802.3ak-2004 specified a 10GBASE-CX4 PHY for the transmission of 10 GbE signaling over 15 meters of coax cabling for short-reach applications. In 2006, two new PHYs will be added, and two more in 2007. The new PHYs address expanding market needs for 10 GbE (Figure 2).
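
The reach figures listed in Figure 2 lend themselves to a simple lookup when choosing a PHY for a given link. The following sketch encodes those values as data; the structure, the medium labels and the function name are illustrative choices rather than part of any standard, and several of the entries were still drafts at the time of writing.

```python
# (PHY type, medium, maximum reach in meters), as listed in Figure 2.
PHY_TABLE = [
    ("10GBASE-SR", "MMF", 300),
    ("10GBASE-LR", "SMF", 10_000),
    ("10GBASE-ER", "SMF", 40_000),
    ("10GBASE-LX4", "MMF", 300),
    ("10GBASE-LX4", "SMF", 10_000),
    ("10GBASE-SW", "MMF", 300),
    ("10GBASE-LW", "SMF", 10_000),
    ("10GBASE-EW", "SMF", 40_000),
    ("10GBASE-CX4", "Coax", 15),
    ("10GBASE-T", "UTP", 100),        # Cat 6 UTP or better
    ("10GBASE-LRM", "MMF", 220),
    ("10GBASE-KX4", "Backplane", 1),  # 4-lane FR4
    ("10GBASE-KR", "Backplane", 1),   # 1-lane FR4
]


def phys_for(medium: str, distance_m: float) -> list[str]:
    """Return the PHY types that cover the distance on the given medium."""
    return [phy for phy, med, reach in PHY_TABLE
            if med == medium and distance_m <= reach]


if __name__ == "__main__":
    print(phys_for("MMF", 250))
    print(phys_for("SMF", 20_000))
```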

Even though two 10 GbE PHY standards have existed for a number of years, the cost of the 10 GbE optical solutions and the short reach of 10 GbE coax solutions have limited widespread adoption of the technology. The cost and power of optics solutions continue to decrease as the technology matures, but the lack of a standard that allows 10 GbE to operate over unshielded twisted-pair (UTP) copper cabling relegates 10 GbE only to applications where it is critical. Ratification of the 10GBASE-T standard (IEEE P802.3an), which addresses a UTP solution, is targeted for June 2006. Once this occurs, lower cost 10 GbE copper PHYs will begin to appear.

Some common misperceptions are associated with 10GBASE-T. Confusion reigns about the support of auto-negotiation, but the 10GBASE-T standard does, indeed, support auto-negotiation. The auto-negotiation feature will permit implementers to design PHY devices that operate over a broader range of speeds and media. Although auto-negotiation will provide backward compatibility with lower speeds, it is unlikely that a 10GBASE-T system will be able to support operation at 10 Mbits/s due to the requirements of the transformer.

Another common misperception concerns support for Category 5 (CAT-5) cabling. CAT-5 and CAT-5e cabling are not listed as supported media in IEEE P802.3an, since 10GBASE-T performance requirements far exceed the cabling’s specified performance. The 10GBASE-T draft calls for the support of CAT-6, CAT-6 Augmented (CAT-6A) and CAT-7 cabling. The Telecommunications Industry Association’s Telecommunications Systems Bulletin (TSB)-155 describes the requirements an existing cabling installation must meet to transport 10GBASE-T signaling. Existing CAT-6 installations that meet TSB-155’s requirements should support a reach of up to 55m. CAT-6A and CAT-7 support a reach of up to 100m. The primary difference between CAT-6A and CAT-7 is that CAT-6A is unshielded cable, whereas CAT-7 is shielded.

At the board level, the optimal interconnect will be primarily determined by the reach required. 10GBASE-CX4 provides a low-cost, higher performance solution for short-haul applications. It is being used as a high-performance interconnect for the clients of client-server or client-workstations involved in data-intensive applications, as well as a replacement for proprietary fabrics in storage area networks.

For networks that require longer-haul distances, such as MANs and WANs, 10 GbE offers the ability to reach up to 40 km. This provides large potential benefits in terms of much faster and more efficient access for applications such as telecommuting, video conferencing, data mining and online research. In WAN applications, 10 GbE allows the transport of terabytes of data in world record-breaking times, linking communities around the globe. The ability of 10 GbE to effectively attach directly to the SONET/SDH core network or to be transported over unused wavelengths minimizes or eliminates the previous need for protocol translation in order to achieve global connectivity.

What is Needed to Achieve 10 
GbE Speeds? 

Implementing 10 GbE to achieve true 10 Gbit/s speeds requires more than simply choosing the right medium. 10 GbE throughput must be sustained both in the infrastructure, including routers and switches, and in the endpoints, which are primarily servers, appliances and networking storage equipment.

To meet these demands, several technical and economic concerns must be met. Increasingly, higher-performance, more expensive systems are required to keep pace with network throughput. However, because of design constraints, the memory subsystem cannot keep up physically.

This means that latency is increasing as processors wait through hundreds of idle cycles while memory works at its slower rate to move data across the network to the waiting applications. Both processors and memory subsystems are being flooded by the network bandwidth, so the industry has been seeking ways to alleviate this bottleneck. This disparity is impacting the performance of today’s systems that are designed to handle critical real-time transactions and bandwidth-intensive applications, such as e-commerce, medical imaging and data warehousing.

One of the best approaches to solving this bottleneck problem is the use of a TCP/IP accelerator solution. A TCP/IP accelerator, or network traffic accelerator (Figure 3), such as the chip offered by Tehuti Networks (Figure 4), reduces flooding by optimizing the system’s ability to process packets. Implemented from the wire to the application, accelerator technology dramatically speeds up enterprise platform-to-network communications by reducing latency, redistributing functions to components that will do each task better and using additional enhancements to improve server system throughput. This allows the servers to maintain an optimal balance of performance and low power while requiring only minimal board real estate, thereby reducing the equipment’s total cost of ownership.

In addition, interfaces are an important consideration. For example, the
64-bit PCI-X bus in the current Intel server architecture already pumps
out data in the multi-gigabit range, whereas the new PCI Express
architecture will easily scale to handle 10 GbE.

What Comes After 10 GbE? 

Although the industry is in the throes of preparing for the ramp of 10 GbE
into high volumes, history reveals that it is already time to begin
thinking about how a higher speed Ethernet will come to fruition. The
concerns discussed above will only be further exacerbated, and new
concerns added, as the market looks toward the next speed generation after
10 GbE.

Some predictions state that by 2012, 
the next higher speed Ethernet adoption 
could mirror that of 10 GbE and approach 
a 50% share of primary interconnect in 
the Top 500 supercomputing sites. It is 
anticipated that in July 2006 there will 
be a Call for Interest (CFI) in IEEE 802.3 
for a Higher Speed Study Group (HSSG), 
which could result in the ratification of the 
next higher speed Ethernet standard some 
time around 2010. 

10 GbE shares many similarities with previous generations of Ethernet, but
there are also some key differences. In the 10 GbE standard itself, many
of the differences are related to the transmission media. However, in
order to achieve maximum 10 GbE performance, the system must be configured
to support 10 GbE throughout the network. This includes improving TCP/IP
processing via hardware assist so the CPU has more cycles for application
processing, as well as making sure that the right interfaces, such as PCI
Express, are in place. It also includes chipsets with memory subsystems
that do not limit system performance. Many of these issues are being
addressed today. In the very near future, the market can look forward to
enjoying complete 10 GbE performance and all of its benefits.

FPGAs: The New Matrix for Design


Changing Horses in Midstream: 
Partial Reconfiguration 
for FPGA Designs 


The ability to leverage partial reconfiguration for programmable logic opens 
new doors to a whole host of applications such as software defined radio, 
dynamic instruction set computing and automatic target recognition. 


by Mark Goosman 
Xilinx 


Ongoing trends in logic design such as shorter product lifecycles, greater
design complexity and increased scaling (e.g., “Moore’s Law”) are
increasing the pressure for faster design speed and lower power in a
smaller physical space. Although today’s advanced FPGAs are rapidly
evolving to address these issues of speed, power and size, new
technologies in the area of partial reconfiguration offer the promise of
even greater advances.

Partial reconfiguration is a design process that allows a limited,
predefined portion of an FPGA to be reconfigured while the remainder of
the device continues to operate. This is especially valuable where devices
operate in a mission-critical environment and cannot be disrupted while
subsystems are redefined. The ability to partially reconfigure a device
takes the already powerful benefits of reprogrammability to a much higher
level.

The obvious benefit of reconfigurable devices, such as FPGAs, is that the
functionality with which a device is configured can be changed and updated
at some time in the future. As additional functionality or design
improvements become available, the FPGA can be shut down and completely
reprogrammed with new logic, and operations can then be resumed. Partial
reconfigurability addresses the environment where logic needs to be
changed or updated within a part of an FPGA without disrupting the entire
system. An example is a design composed of several blocks of logic in
which the functionality of one block must be updated without disrupting
the system or stopping the flow of data.

Using partial reconfiguration, designers can dramatically increase the
functionality of a single FPGA, allowing for fewer, smaller devices than
would otherwise be needed. This allows for additional functionality, lower
power, reduced cost and less physical space on the board.

Partial reconfiguration is useful for systems with multiple functions that
can time-share the same FPGA device resources. In such systems, one
section of the FPGA continues to operate while other sections of the FPGA
are disabled and reconfigured to provide new functionality. This allows
concurrent support for multiple independent applications in a single FPGA.
This is somewhat analogous to dynamic task switching or multitasking of a
general-purpose processor. Without this capability, it would be necessary
to reconfigure the entire FPGA to support a different application, which
would result in the loss of all previous applications.

Partial reconfiguration provides an advantage over multiple full
bitstreams in applications that require continuous operation, which a full
reconfiguration would interrupt. One example is a graphics display that
utilizes horizontal and vertical synchronization. Because of the
environment in which this application operates, signals from radio and
video links need to be preserved, but the data format and data processing
may require updates and changes during operation. With partial
reconfiguration, the system can maintain these real-time links while other
modules within the FPGA are changed on the fly.

Implementing partial reconfiguration first requires an FPGA that
inherently supports the dynamic reconfiguration of only portions of the
device, while leaving the other portions unaffected. Next, a set of
software development tools is needed that supports the development of
applications restricted to boundaries that comply with the hardware
architecture of the FPGA. Finally, some form of basic controller must be
available to dynamically manage the reconfiguration of the FPGA. This
could be an embedded general-purpose processor (GPP), a soft-core GPP or
an external GPP connected to the FPGA. In this shared resources model, the
same embedded GPP that is running the design infrastructure and operating
environment is also managing the partial reconfiguration of the FPGAs.

In an FPGA, all user-programmable features are controlled by memory cells
that are volatile and must be configured on power-up. These memory cells
are known as the configuration memory, and define the look-up table (LUT)
equations, signal routing, input/output block (IOB) voltage standards and
all other aspects of the design.


To program configuration memory, instructions for the configuration
control logic and data for the configuration memory are provided in the
form of a bitstream, which is delivered to the device through the JTAG,
SelectMAP, serial or ICAP configuration interface.

Typically, a user performs the initial programming by downloading an
entire bitstream to an inactive target device. Using partial
reconfiguration, a subset of the FPGA can be reprogrammed using a partial
bitstream. You can use a partial bitstream to change the structure of one
part of an FPGA design as the rest of the active device continues to
operate.
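The difference between a full and a partial bitstream can be pictured with a toy model. The sketch below treats configuration memory as a short list of frames and a bitstream as a mapping from frame index to frame data. The frame count, design names and frame contents are invented for illustration and do not reflect any real device's frame layout.

```python
# Toy model of full versus partial configuration (illustrative only).
# Configuration memory is modelled as a list of frames; a bitstream is a
# mapping {frame_index: frame_data}. Frame counts and contents are made up.

NUM_FRAMES = 8  # hypothetical device size, in frames


def full_bitstream(design: str) -> dict[int, str]:
    """A full bitstream carries data for every frame in the device."""
    return {i: f"{design}:{i}" for i in range(NUM_FRAMES)}


def partial_bitstream(module: str, frames: range) -> dict[int, str]:
    """A partial bitstream carries data only for the reconfigurable region."""
    return {i: f"{module}:{i}" for i in frames}


def configure(memory: list[str], bitstream: dict[int, str]) -> list[int]:
    """Write the bitstream into memory; return the indices that were rewritten."""
    for index, data in bitstream.items():
        memory[index] = data
    return sorted(bitstream)


memory = [""] * NUM_FRAMES
configure(memory, full_bitstream("base"))      # initial power-up load
touched = configure(memory, partial_bitstream("filter_v2", range(5, 8)))
print(touched)       # frames rewritten by the partial bitstream
print(memory[:5])    # frames outside the region keep their original contents
```

Only the frames named in the partial bitstream are rewritten, which is why the logic configured in the remaining frames can keep running during the update.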

A Methodology for Partial Reconfiguration 

Successful implementation of a design using a partially reconfigurable
flow requires following a strict design methodology. A reconfigurable
design consists of partially reconfigurable modules (PRMs), which are
swapped in and out of the FPGA, and the static (or fixed) logic, which
remains in place. In general terms, the design flow requires inserting bus
macros between the PRMs and the static logic. Bus macros are the channels
or ports through which modules communicate and pass data. This gives the
static logic a fixed communication channel regardless of the
reconfigurable logic on the other side.

Successful design requires following the guidelines of the synthesis tools
to generate a partially reconfigurable net list. The synthesis tool must
be configured so that no optimizations occur across hierarchical
boundaries. This is generally done with a KEEP_HIERARCHY or similar
directive.

The next step is to use the tool to floorplan the PRMs and cluster all
static modules together and then place the bus macros between the PRMs and
the static logic following PRM-specific design rules. Finally, run the
partial reconfiguration implementation flow.

PlanAhead 8.1 from Xilinx is an example of a single environment (or
platform) used to manage the preceding guidelines, which can be broken
down into the following steps:

1. Net list import 

2. Floorplanning the design for partial reconfiguration

3. Design rule checks 

4. Net list export 

5. Implementation flow management 

6. Bitstream size estimation 

Although these steps are straightforward, the methodology requires
meticulous implementation in order to ensure success. Changing out a
portion of a complex, high-speed design does not allow much margin for
error.

Use a tool like PlanAhead that works with imported net lists, such as
those from XST or Synplify, to import any hierarchical net list (single
edf/ngc or multiple edf/ngc files). Then follow the regular guidelines to
import the design into the tool and create a floorplan as you would with
any non-partially reconfigurable design.

Figure 3

Floorplanning for partial reconfiguration is an important step in the
partial reconfiguration flow. Floorplanning is based on design partitions
referred to as physical blocks, or Pblocks. A Pblock can have an area
(such as a rectangle) defined on the FPGA device to constrain the logic.
The designer can define Pblocks without rectangles and the implementation
software will attempt to group the logic during placement. Net list logic
placed inside of Pblocks will receive AREA_GROUP constraints.

Floorplanning for partial reconfiguration entails several key subtasks.
The first subtask is to assign an area for the PRM by creating a Pblock
with an area defined within the fabric. This includes assigning the RANGE
values for the Pblock. The MODE constraint must be defined for all
reconfigurable regions (MODE=RECONFIG). This constraint prevents the
implementation tools from failing with unexpanded block errors during
implementation of the static and reconfigurable modules.

Every top-level module, other than PRMs, should be grouped together in a
single Pblock. This is called a static logic block. This block should not
have a RANGE defined; this will cluster the static logic together in a
single Pblock. Select all top-level modules (except the PRMs) and assign
them to a Pblock. Figure 1 shows the static logic grouped in a Pblock
named AG_base. When the floorplanning is completed in the design tools,
the resulting physical hierarchy will be organized as shown in Figure 2.
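The floorplanning rules described above lend themselves to simple automated checks. The following sketch uses a hypothetical data model, not PlanAhead's own interface, to flag a PRM Pblock with no RANGE and a static Pblock that has one. The additional check that PRM regions do not overlap is an assumption added for illustration, and all names and coordinates are invented.

```python
# Sketch of floorplanning sanity checks (hypothetical data model, not the
# PlanAhead API). Coordinates are inclusive slice positions.
from dataclasses import dataclass
from typing import Optional, Tuple

Rect = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max)


@dataclass
class Pblock:
    name: str
    reconfigurable: bool           # True for a PRM region (MODE=RECONFIG)
    area: Optional[Rect] = None    # None means no RANGE is defined


def overlaps(a: Rect, b: Rect) -> bool:
    """True if two inclusive rectangles share at least one slice."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def check_floorplan(pblocks: list[Pblock]) -> list[str]:
    errors = []
    for p in pblocks:
        if p.reconfigurable and p.area is None:
            errors.append(f"{p.name}: PRM Pblock needs a RANGE")
        if not p.reconfigurable and p.area is not None:
            errors.append(f"{p.name}: static Pblock should not have a RANGE")
    # Assumed rule for this sketch: PRM regions must not overlap each other.
    prms = [p for p in pblocks if p.reconfigurable and p.area is not None]
    for i, a in enumerate(prms):
        for b in prms[i + 1:]:
            if overlaps(a.area, b.area):
                errors.append(f"{a.name} overlaps {b.name}")
    return errors


plan = [
    Pblock("AG_base", reconfigurable=False),
    Pblock("AG_prm_a", reconfigurable=True, area=(0, 0, 19, 39)),
    Pblock("AG_prm_b", reconfigurable=True, area=(10, 20, 29, 59)),
]
print(check_floorplan(plan))
```

A real design tool performs far more checks than this; the point is only that the Pblock rules are mechanical enough to verify before implementation begins.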

The next step is to place the bus macros. Bus macros are physical ports
that connect a PRM to static logic. Any connection from a PRM to static
logic should always go through a bus macro. Bus macros are instantiated as
black boxes in RTL and are filled with a predefined routing macro in the
form of an .nmc file. Bus macros are placed on the PRM boundary. Static
logic connected to PRMs will migrate toward the bus macro during
placement.

Given the complexity of the flow, it is very common for mistakes to be
introduced in the original RTL and during the floorplanning process. Any
tool worth its salt will check for design violations. In PlanAhead, this
feature also integrates the PR-Advisor, which provides feedback on how to
improve your design. There are a number of design rule checks that are
specific to partial reconfiguration.

The bus macro DRC provides verification for all design rules related to
bus macro connectivity and placement. One example of a bus macro DRC is
the PRBP check. This DRC checks for all rules that should be followed for
bus macro placement. Figure 3 shows an example of a design that failed the
PRBP DRC. In this case, the tool tells us that the interleaved/nested
macro should be placed at SLICE_X41Y.

Figure 4

Special consideration should be given to timing-critical paths, which
include an asynchronous path.

The Floorplanning DRC covers floorplanning rules. Clock objects (global
clock buffers, DCM) and I/Os should be placed and static logic clustered.
The glitching logic DRC verifies glitching logic elements (SRL and
distributed RAM) above and below PRM regions.

Another DRC is the timing advisor/DRC. This provides a check for
timing-related issues. One example of a timing DRC is the PRTB check. With
the PRTB check, the static module is implemented before the PRM during the
implementation phase. Regular timing constraints do not cover the paths
that cross between the static and a PRM module. This does not present a
problem, provided that the bus macro is synchronous. However, if it is an
asynchronous bus macro, the static module does not know about the
propagation of asynchronous paths, as shown in the example in Figure 4.
This could be important if these paths are timing-critical. One way to
pass this information to the static module is to specify a TPSYNC
constraint on the bus macro output net. PlanAhead software will recommend
a TPSYNC constraint that can be added to the .UCF file.

Upon successful completion of the floorplanning DRC processes, all
modules will be displayed in black text within the Export Floorplan dialog box.

Figure 6

The display of Pblock properties includes the estimated size of the bitstream.


Once the design is floorplanned and passes the DRC checker, it is ready to
be exported. The design tools should take care of exporting the original
hierarchical net list into a PR-style net list that has a specific format
(static and PRM in separate directories). The export directory will appear
as shown in Figure 5. Next, a partial reconfiguration flow wizard, shown
in Figure 6, runs the partial reconfiguration implementation on the
exported design. It will produce a full bitstream for the complete design
and a partial bitstream for each of the PRMs. The implementation steps
are:

• Initial budgeting 

• Static module implementation 

• PRM module implementation (one implementation for each version of
every PRM)

• Assembly and bitstream generation (results are stored in the merge
directory)


The Pblock statistics report includes a section that reports PRM bitstream
size (Figure 6). This information can be used for estimating the size of
configuration memory storage such as external flash and DDR. This
information can also be used to calculate how long it will take to swap
the module based on your bitstream memory interface.
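The swap-time estimate mentioned above is simple arithmetic. The sketch below assumes an idealized interface that transfers one bus-width word per clock with no protocol overhead. The bitstream size, bus widths and clock rate are hypothetical values rather than figures from any data sheet.

```python
# Ideal partial-bitstream load time (illustrative sketch).
# Assumes one bus-width word is transferred per clock with no protocol
# overhead; the size, bus widths and clock rate below are hypothetical.


def reconfig_time_s(bitstream_bytes: int, bus_width_bits: int, clock_hz: float) -> float:
    """Return the time in seconds to stream a bitstream into the device."""
    return (bitstream_bytes * 8) / (bus_width_bits * clock_hz)


PRM_BITSTREAM_BYTES = 150_000   # hypothetical value read from a Pblock report

for label, width, clock in [("8-bit parallel", 8, 50e6), ("1-bit serial", 1, 50e6)]:
    ms = reconfig_time_s(PRM_BITSTREAM_BYTES, width, clock) * 1e3
    print(f"{label} interface at {clock / 1e6:.0f} MHz: {ms:.1f} ms")
```

In practice the storage device, controller and any decompression step add time on top of this lower bound, so the result is best treated as a first-order estimate.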

Partial reconfiguration offers a tremendous opportunity for designers
looking for a way to increase the functionality of their design, achieve
lower power and reduce the number of devices on their board. Using new
design tools and techniques becoming available for partial reconfiguration
applications can greatly simplify the complexities of juggling the dynamic
operating environment of these cutting-edge applications, allowing a
single device to operate in applications that previously required multiple
FPGAs along with the required power, board space and design overhead.

Xilinx 

San Jose, CA. 

(408) 559-7778. 

[www.xilinx.com]. 
Reduce Cost, Risk and
Time-to-Market with an 
FPGA-to-Structured ASIC 
Strategy 

The costs, development times and risks of ASIC designs have become all but
prohibitive. Finalizing a design on an FPGA and moving it to a structured
ASIC can cut cost, time and risk as well as result in smaller size and
lower power consumption.


by Danny Biran 
Altera 


Few would dispute that ASICs and ASSPs have become viable vehicles for
only a few silicon vendors. As process nodes continue to shrink (to 90 nm
now and to 65 nm in the near future), not many vendors are willing to take
on the risk associated with an ASIC design unless they have a very high
level of confidence that their chip will be sold in huge volumes to
justify an investment that is often tens of millions of dollars
(Figure 1). ASICs are under pressure. It is increasingly difficult for
companies to justify the development of a new chip with the very high NRE
price tags. If chip development cost is $30M, R&D costs are 20% of
revenue, and one expects 10% market share, only a $1.5 billion market
opportunity can justify the expense. There aren’t many markets of this
size that can be serviced with a single product.
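The $1.5 billion figure follows from two divisions, which the short sketch below reproduces using the numbers quoted in the text. The function and variable names are illustrative only.

```python
# Market size needed to justify a chip development budget (worked example).
# Inputs are the figures quoted in the text; names are illustrative.


def required_market_size(dev_cost: float, rd_fraction: float, market_share: float) -> float:
    """Return the total market needed for R&D spending to cover dev_cost."""
    revenue_needed = dev_cost / rd_fraction    # revenue whose R&D share equals the cost
    return revenue_needed / market_share       # total market at the expected share


market = required_market_size(dev_cost=30e6, rd_fraction=0.20, market_share=0.10)
print(f"Required market: ${market / 1e9:.2f} billion")
```

A $30M development cost at 20% R&D spending implies $150M in revenue, and $150M at a 10% share implies a $1.5 billion market.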

One of the risks associated with building an ASIC is that its
functionality is fixed during fabrication, resulting in the need to “get
it right the first time,” which can often be hard to achieve. The cost of
redesign for even a small portion of an ASIC that doesn’t work may be
prohibitive in terms of both time and money. To mitigate this risk, FPGAs
are often used as prototyping vehicles. With their high levels of logic
density and performance, high-density FPGAs help developers implement
complex designs that can be tested in the target system. FPGA prototypes
offer many distinct advantages, allowing a design to have its hardware,
application software and firmware fully developed and tested at full speed
without a multi-million-dollar price tag.


Figure 1

The cost of chip development goes up dramatically as technology nodes
shrink. (Chart by process node: 0.18 µm, 0.15 µm, 0.13 µm, 90 nm, 65 nm
and 45 nm. Legend entries recovered: Masks & Wafers, Software, Test &
Product Engineering. Note: Conservative estimate; does not include
re-spins. Source: International Business Strategies & Altera.)

Moving to a Structured ASIC 

Once the design is prototyped in the FPGA, the next step is to move to an
effective silicon platform. Migrating from an FPGA to an ASIC is tricky
and time-consuming, since there is generally a good deal of work required
to make sure the original design will still function correctly when
implemented as an ASIC.

A better solution is to move the verified FPGA design onto a structured
ASIC, which has the characteristics of an FPGA and most attributes of an
ASIC, and is significantly smaller than an FPGA. A structured ASIC has
near-ASIC performance and power consumption, along with a shorter
development cycle and lower NRE than an ASIC. It also has a much lower
unit cost than a comparable FPGA, as little as one-tenth the cost.

Structured ASICs are comprised of prefabricated base arrays with
predefined and verified logic, memory, clock networks and I/O resources.
The arrays are processed through manufacturing up to a certain point and
then “banked” for future customization. Application-specific designs are
then configured onto the base arrays using a few top metal layers, thus
creating the structured ASIC. This gives a structured ASIC a
post-fabrication configurability dimension that is not available with an
ASIC. According to Gartner Group, structured ASICs are rapidly gaining in
popularity, with revenues projected to grow from $99 million in 2004 to
$848 million by the end of 2007.

Designers should be aware that different structured ASIC vendors implement
their architectures in very different ways, and even different families
from the same vendor can differ. Parameters such as number of
user-definable layers, number of equivalent ASIC gates, memory capacity,
maximum clock rate and target processes are very different from vendor to
vendor. As a result, migrating from one vendor’s FPGA to another vendor’s
structured ASIC, while easier than migrating to an ASIC, is still
difficult and fraught with potential errors that can add time and cost to
reaching production silicon.

The Key for Structured ASIC Success 

If the same silicon vendor produces both silicon platforms, moving a
design from a system-verified FPGA to a structured ASIC, while maintaining
functionality and meeting timing constraints, is greatly simplified and
provides the customer with a low-cost and safe “path to production”
strategy. A one-to-one mapping between the FPGA and structured ASIC can
eliminate redesigning the board, and redeveloping and revalidating the
system, resulting in significant development cost savings and
time-to-market benefits. Proven IP cores, common to both platforms, a
single design flow, common EDA tools and pin-to-pin compatibility between
the FPGA and the structured ASIC assure that the smaller, less expensive
structured ASIC will work correctly in the application.

The costs associated with a structured ASIC are much lower than those for
an ASIC. For a 90-nm design with, say, 2.2 million ASIC gates, 9 Mbits of
SRAM and 1.5 million gates for DSP and multipliers, NREs are often as low
as $225,000 to $300,000, which is significantly lower than just the cost
of a complete mask set for an ASIC. Designs prototyped and verified in an
FPGA and then migrated to a structured ASIC can be used for all but the
very highest performance and/or cost-sensitive applications that would
justify the cost optimization realized by a full ASIC.

[...] the Stratix II FPGA. Using ALMs instead of traditional 4-input
look-up tables (LUTs) increases logic-utilization efficiency and
performance.

When migrating from an FPGA to a structured ASIC, the 
vendor can eliminate a lot of the circuitry from the FPGA that 
is not required for normal chip operation. The removed circuitry 
includes FPGA configuration logic, programmable routing and 
logic and memory programmability. What needs to be added to 
the structured ASIC is embedded testability, since the circuitry 
needed to test the structured ASIC is much different than that 
for the FPGA. Eliminating all the extra transistors results in a 
much smaller chip with a corresponding unit cost reduction of as 
much as 90%. The structured ASIC also sees a significant power 
reduction. The shorter interconnect paths result in lower dynamic 
power dissipation and the structured ASIC has lower static power 
consumption with the elimination of the many transistors used 
for configuration and programming on the FPGA. 
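One way to weigh the unit-cost reduction against the NRE is a break-even volume. The sketch below uses the $225,000 to $300,000 NRE range and the best-case 90% unit-cost reduction quoted in the article. The FPGA unit price is a hypothetical figure chosen only for illustration, and the calculation ignores schedule, inventory and requalification costs.

```python
# Break-even volume for moving from an FPGA to a structured ASIC (sketch).
# The NRE range and the "as much as 90%" unit-cost reduction come from the
# text; the FPGA unit price is a hypothetical figure for illustration only.


def break_even_units(nre: float, fpga_unit_cost: float, cost_reduction: float) -> float:
    """Units at which per-unit savings have paid back the one-time NRE."""
    savings_per_unit = fpga_unit_cost * cost_reduction
    return nre / savings_per_unit


FPGA_UNIT_COST = 500.0   # hypothetical price per FPGA, in dollars
COST_REDUCTION = 0.90    # best case quoted in the text

for nre in (225_000, 300_000):
    units = break_even_units(nre, FPGA_UNIT_COST, COST_REDUCTION)
    print(f"NRE ${nre:,}: break-even at about {units:,.0f} units")
```

With a smaller cost reduction or a cheaper FPGA the break-even volume rises proportionally, so the result depends heavily on the assumed unit price.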

A key to successful FPGA-to-structured-ASIC migration, even when both
silicon platforms are from the same vendor, is how well the basic logic
building blocks of the structured ASIC implement the logical functions of
the FPGA. This is not a simple task, and requires a lot of work on the
part of the vendor who has developed both platforms.

For example, the logic structure of the Stratix II FPGA architecture
comprises basic logic units known as adaptive logic modules (ALMs). As
shown in Figure 2, each ALM contains a variety of look-up table
(LUT)-based resources, two full adders, carry-chain segments, two
flip-flops and many additional logic enhancements that can be divided into
two adaptive LUTs (ALUTs). One ALM can implement logic functions with up
to seven inputs and complex logic-arithmetic functions, increasing logic
efficiency and reducing routing resources.

The HardCopy II structured ASIC family comprises an array of fine-grained
structured cells called HCells that are grouped into HCell macros to
implement a portion of an ALM or a section of a DSP block. The design
software maintains a library containing a pre-verified, pre-characterized
HCell macro for every ALM configuration and uses it to map the ALMs into a
structured ASIC design. The tool only maps the utilized portion of each
ALM to HCell macros (Figure 3); if parts of an ALM are not used in the
FPGA design, then they are not mapped to the HardCopy II device, yielding
a more efficient mapping of the prototyped design.

Figure 3

If sections of an ALM are not used in the FPGA design, then they are not
mapped to the structured ASIC. This increases the efficiency of the
FPGA-to-structured ASIC operation.

RadHard Eclipse FPGAs are now in production

Proven commercial architecture
Efficient FPGA fabric
Low power
On-chip RadHard SRAM
Rapid prototyping capabilities
Guaranteed performance to 300Krad(Si)
ITAR free

When compared to a corresponding Stratix II device, a 
HardCopy II structured ASIC is 50% faster, dissipates up to 70% 
lower core power and has a 60-85% smaller chip size. 

A HardCopy structured ASIC also offers time-to-market advantages. The
typical time it takes an ASIC/ASSP to go from design initiation to
completion is around two years. During that time, market opportunity for
the chip can be severely hindered or it can even disappear. Taking a
design to a structured ASIC typically takes half that time or less,
allowing the product to reach the market significantly sooner than if it
were implemented as an ASIC or ASSP.

The secret to making FPGA design an integral part of product development
beyond using the FPGA only as a prototype device is to provide a clear and
low-risk path from the FPGA to a production-viable silicon platform, such
as a structured ASIC. With near-ASIC performance and cost, structured
ASICs provide a production silicon solution for a broad range of
applications that are currently filled by expensive and unreliable (in
terms of cost and development schedule) ASICs and ASSPs. However, with so
many different structured ASIC architectures and business models
available, an important factor in successful FPGA-to-structured ASIC
migration is re-mapping the logic functions of one platform to the other
as opposed to doing an expensive and risky architectural conversion. This
is accomplished by working with a vendor who has developed both the FPGA
and structured ASIC architectures, along with the EDA tools a designer
uses to develop and verify the target design on the FPGA and then migrate
the design to a structured ASIC.

Altera 

San Jose, CA. 

(408) 544-7000. 

[www.altera.com]. 
Synplify® DSP

A Breakthrough in DSP Synthesis 



Synplicity’s Synplify DSP synthesis solution
offers DSP designers an efficient automated
path from Simulink® to optimized RTL code.
And, unlike other solutions, Synplify DSP
software provides technology-independent
device targeting and advanced synthesis
optimizations for performance and area.

Synplify DSP Software Uniquely 
Offers: 


A User-extensible DSP library: quickly 
create and re-use custom functions 


Sample-rate-based synthesis engine: capture 
and explore algorithm behavior without 
worrying about target hardware 


Retiming: automatic pipelining and register 
insertion for higher speeds 


Folding: automatically apply resource 
sharing logic for lower area (device cost) 


Multichannelization: automatically apply 
resource sharing across multiple channels 
for lower area (device cost) 


For more information on Synplicity’s Synplify DSP solution and all of
Synplicity’s offerings, please visit our website at www.synplicity.com
or contact info@synplicity.com


Synplicity 
Moving Real-Time Data around
FPGA-Centric Systems 

Support for real-time FPGA networks is emerging from board level vendors. 
As larger FPGAs and solutions that are more sophisticated are required, this 
will become increasingly important. 


by Jeremy Banks 
VMetro 


Computationally intensive applications often involve real-time
high-bandwidth data streams coupled with fast signal processing. This
increasingly involves using multiple FPGAs because they are faster than
CPUs or DSPs. However, coupling FPGAs (or IP cores) can be difficult and
can consume unnecessarily large amounts of FPGA resources, leaving fewer
resources for the actual signal processing. Since the data movements can
occur in parallel, the network communications can also be complex
(Figure 1). The ability to establish links between FPGA IP cores, no
matter where they are physically located, is important.

The trend with CPU data communications is to move away from parallel bus
structures in favor of multiple serial point-to-point data links. This
makes systems easier to build and improves system performance because CPUs
do not have to share a bus; instead, there are dedicated communication
paths that optimize the system communication. This also improves
determinism by reducing the data traffic and the number of data sources on
a link, an essential requisite for real-time solutions.

The most popular serial fabrics used for point-to-point communications
include PCI Express, Serial RapidIO, Gigabit Ethernet and InfiniBand.
High-speed serial communication (HSSC) using these fabrics is becoming the
backbone of modular high-performance processing solutions through the
adoption of standards such as VXS (VITA 41) and VPX (VITA 46) (Figure 2a
and 2b). Boards and systems built using HSSC are providing high-density,
tightly coupled FPGA and CPU solutions with high-bandwidth I/O. Such
solutions are well suited to real-time applications.

Protocol-rich HSSC fabrics work well for CPU-centric systems, but for
FPGAs, these fabrics are a luxury that sacrifices resources to implement
complex communications. For systems that include both FPGAs and CPUs,
using CPU-centric fabrics at the CPU-FPGA boundary makes sense. However,
for direct FPGA-to-FPGA communications, an alternative is required in both
the type of fabric and how it is used. After all, an FPGA is not a CPU,
and treating it as such will dilute the true benefits offered by FPGAs.

Data Flow 

A common way to design a system in its early conceptual stages is to draw
a block diagram. In doing this, the designer is describing the major
processing blocks and how data moves through them. Memory mapping is a
concept that CPUs use to arrange and process their data. Memory-mapped
fabrics, such as PCI Express, fit well with CPU-centric processing models.
Implementing the conceptual design onto a CPU requires “converting” the
block diagram into the CPU’s memory-mapped model.

By contrast, an FPGA can be used to lay out the block diagram “as is”:
parallel flows of data can be the same as in the block diagram, as can the
processing blocks. Along with increased performance, this one-to-one
mapping is a key advantage of FPGAs. For communication, a flow of data, or
stream, is a better model to use with FPGAs than memory maps. However,
implementing the block diagram and its data flows on an FPGA presents some
practical limitations. Even assuming all the processing blocks (or IP) are
available, there is still the issue of whether all of this functionality
can be fitted onto a single FPGA device. If not, can the parallel and
independent data flows between processing blocks on different FPGAs be
maintained and operate as if they are processing blocks on the same
device?

Figure 2

Example VXS board-based solution linking multiple FPGAs together over HSSC
links (a), and examples of board-level components, a VXS switch, VXS FPGA
and FPGA PMC with HSSC links (b).

Imposing CPU-style fabrics to manage the data flow would 
be a step backward because it would mean handling the data in 
different ways than those needed by the FPGA and would no longer 
represent a “flow.” However, simple real-time communications 
networks based on data flows are becoming available. In their 
simplest form, these real-time networks need reliable data links and a 
tag to describe which processing block the data is to be delivered 
to and, ideally, what the data represents. Dedicated real-time 
communications network IP cores can handle all of this work without 
having to implement memory-mapped fabrics. 

Real-Time Communications Networks 

Real-time systems are deterministic; for a given event, any 
event, a real-time system must respond within a guaranteed period. 
This is a common statement dictating what a real-time, 
deterministic system has to be able to achieve. When FPGAs are 
used for processing because of their speed, the ability of the 
network to deliver data within guaranteed constraints is even more 
important. Why use a device for performance if you cannot get 
data to it in the first place? For a system to be deterministic, the 
communications network must also be deterministic. This can be 
further complicated if there is a need to provide support for 
multiple data streams across a few links. This is a potential bottleneck 
and threat to real-time performance. 

What makes a network deterministic? If the network can reliably 
deliver its complete payload within a guaranteed time period, 
no matter what is happening elsewhere on the network, it can be 
considered deterministic. For dedicated point-to-point 
connections, this is straightforward. However, what about packet-switched 
connections where the data hops through many devices before 
arriving at its destination (Figure 3)? Or, what about situations 
where multiple data streams must share physical links? 

Non-blocking switches are part of the answer, but there is no 
substitute for determinism by design, a higher-level system concept. 
One way to do this is to ensure that all data paths through the 
system are known (ideally fixed) with dedicated, allocated and 
guaranteed bandwidth, all the way through the system. If switches are 
used, they should be non-blocking on all data channels. 
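
The bandwidth side of determinism by design can be checked mechanically when the system is defined. The following is a minimal sketch, not a method described in this article: it assumes every stream follows one fixed path and reserves a constant bandwidth, and it reports any link whose reservations exceed its capacity. The node names, the bandwidth figures and the function name `check_link_budgets` are hypothetical.

```python
from collections import defaultdict


def check_link_budgets(streams, link_capacity):
    """Return {link: reserved_bandwidth} for every oversubscribed link.

    streams: {name: (path, bandwidth)} where path is a fixed list of nodes.
    link_capacity: {(node_a, node_b): capacity} in the same bandwidth unit.
    An empty result means every fixed path fits its reserved bandwidth.
    """
    reserved = defaultdict(float)
    for path, bandwidth in streams.values():
        # A path [a, b, c] uses the links (a, b) and (b, c).
        for link in zip(path, path[1:]):
            reserved[link] += bandwidth
    return {
        link: total
        for link, total in reserved.items()
        if total > link_capacity.get(link, 0.0)
    }


# Hypothetical three-FPGA chain, bandwidth in Mbytes/s (illustrative only).
streams = {
    "sensor_a": (["fpga0", "fpga1", "fpga2"], 200.0),
    "sensor_b": (["fpga0", "fpga1"], 150.0),
}
link_capacity = {("fpga0", "fpga1"): 400.0, ("fpga1", "fpga2"): 250.0}
oversubscribed = check_link_budgets(streams, link_capacity)
```

A check of this kind covers only the bandwidth reservation; latency through switches and bridges would need a separate budget.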

For properly designed real-time networks, the protocol can 
remain relatively simple and low level, which is ideal for FPGAs 
because fewer resources are used. Since data paths between nodes are 
known, features such as out-of-order packet handling (as used by 
TCP/IP) are unnecessary, as is flow control (other than moving 
data between clock domains using simple FIFOs). To a certain 
extent, even error correction can be simplified for real-time 
systems (error detection is still important); otherwise, the question of 
what to do with the error is raised. 

By definition, if the data has errors in the first place, then resending 
it is no guarantee that it will be correct the next time. If the data 
connection failed (errors occurred or no data at all was received), why 
shouldn’t it fail repeatedly? This indeterminate situation is not good for 
a real-time system. The solution is usually a system design parameter 
of being able to live with the errors. For an imaging application, the 
processors may make the decision to throw away the data if it is useless, 
carry on and resynchronize. For FPGA-based designs, this simplification 
of the network protocol saves a large amount of FPGA resources. 



is not a direct physical data link. 


Efficient FPGA-Based Fabric 
Communications 

Clusters of FPGAs that link streams of data between devices in real 
time need efficient point-to-point data links and protocols. Using 
protocols such as PCI Express consumes large amounts of FPGA 
resources. A PCI Express x4 core could account for as much as 30-40% 
of a Xilinx XC2VP50’s resources. By contrast, a much simpler protocol, 
such as Serial FPDP, uses as little as 1% of the resources (for a x1 
channel) of the same size FPGA. But Serial FPDP was developed as a 
simple sensor interface rather than a network protocol (Table 1). 
However, the Aurora protocol, developed by Xilinx, is optimized for 
FPGA-to-FPGA communications. By comparison, Aurora uses around 4% of 
an XC2VP50 FPGA; this is for a x4 channel (less for a x1 channel) and 
includes lane alignment, error detection and flow control, both 
user-defined and native. 

58 May 2006 

FPGAs: The New Matrix for Design 

Serial protocols using multi-Gbit/s data links are an efficient 
use of I/O pins for an FPGA. However, even the largest FPGA 
devices have no more than 20-30 full duplex serial links, or if used 
as x4 links for higher bandwidth, there are only four to six 
channels. If that is the case, what about large clusters of FPGAs with 
multiple data types and high-connectivity requirements? How are 
the limitations in the number of connections resolved? For these 
scenarios, the data links have to be shared with the different data 
flows between or through devices using a real-time network. In 
effect, the network must support bridging through an FPGA so that 
logical data paths can be established anywhere around the system. 

The problem with the simple point-to-point protocols developed 
for FPGAs is that they do not handle bridging via an intermediary 
FPGA. Higher-level protocols are required that understand 
what data is targeted at them and if it is not, where it should go. If 
the data flow is self-identifying, then it should be straightforward for 
the real-time network to handle this. However, it is not something 
that the developer wants to deal with; there is an increased 
expectation that this is something the system vendor should be providing. 
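
The bridging behavior described above can be sketched in a few lines. This is an illustrative software model only, under stated assumptions: each packet carries a destination node and a processing-block tag, and each node holds a fixed next-hop table. The `Packet` and `Node` types, the field names and the three-node chain are hypothetical and do not represent any vendor's protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    dest_node: str   # FPGA that hosts the target processing block
    block_tag: int   # identifies the processing block (hypothetical field)
    payload: bytes


class Node:
    """One FPGA in the cluster, with a fixed (design-time) routing table."""

    def __init__(self, name, next_hop):
        self.name = name
        self.next_hop = next_hop   # destination node -> neighbor to forward to
        self.delivered = []        # (block_tag, payload) handed to local blocks

    def handle(self, packet):
        """Deliver locally and return None, or return the neighbor to bridge to."""
        if packet.dest_node == self.name:
            self.delivered.append((packet.block_tag, packet.payload))
            return None
        return self.next_hop[packet.dest_node]


def send(nodes, source, packet, max_hops=16):
    """Walk a packet along the fixed routes and return the nodes visited."""
    hops = [source]
    current = source
    for _ in range(max_hops):
        neighbor = nodes[current].handle(packet)
        if neighbor is None:
            return hops
        hops.append(neighbor)
        current = neighbor
    raise RuntimeError("routing table loop or path longer than max_hops")


# Hypothetical chain a - b - c: traffic from a to c is bridged through b.
nodes = {
    "a": Node("a", {"b": "b", "c": "b"}),
    "b": Node("b", {"a": "a", "c": "c"}),
    "c": Node("c", {"a": "b", "b": "b"}),
}
route = send(nodes, "a", Packet(dest_node="c", block_tag=7, payload=b"\x01\x02"))
```

Because the tables are fixed at design time, the path of every stream is known in advance, which is what keeps the bridged network deterministic.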

A Developer’s Perspective 

Handling FPGA communications in the development of an 
FPGA-centric design can be complex. While the performance 
advantages of FPGAs are well understood, implementing the network 
communications efficiently is critical for a successful real-time 
system. Ideally, a developer would like to have a toolkit of 
firmware IP and software components, just as they have for processing 
blocks. These components must be provided in such a way that 
only the components that are needed are included at compile time 
for optimal solutions, rather than generic code blocks that cater to 
all situations. This saves valuable FPGA resources by removing 
unnecessary IP cores for data channels, DMA controllers, etc. 

Communications IP Core    Approximate resource usage of XC2VP50 FPGA 
PCI Express x4            30-40% 
Serial FPDP (x1)          1% 
Aurora (x4)               4% 

Table 1 FPGA Core usage for examples of communications protocols. 

The ideal components to do this include communication link 
controllers, which maintain the physical interfaces such as a high-speed 
serial communications link, or parallel LVDS ports with the ability to 
support virtual data streams, non-blocking switches and simple 
interfaces for IP to link into the fabric, etc. Such toolkits are now 
becoming available from companies such as VMetro with its TransComm 
firmware and software tools (Figure 4). With such toolkits, creating 
a real-time communications fabric for FPGAs, perhaps included on 
analog input designs, becomes much easier with reduced risk. 

Developers want to harness the power of FPGAs and focus on 
their own expertise, the application IP, not the network 
communications. Support for real-time FPGA networks is emerging from 
board-level vendors. As larger FPGAs and more sophisticated 
solutions are required, this will become increasingly important. 

VMetro 
Houston, TX. 

(281)584-0728. 

[www.vmetro.com]. 

Design Methodologies 
Help Leverage IP for 
FPGA-Based Development 

For FPGA-based designs today, the spectrum of IP available on the market, 
common implementations in industry and fine tuning with technologies under 
the hood make leveraging IP a strong consideration. 

by Jeff Harriman, Xilinx 

Jeff Meisel, National Instruments 


Code reuse in FPGA designs offers advantages such as a 
reduction in overall software development costs and a shorter 
time-to-market, both of which are well documented. A feature 
sometimes overlooked, however, is the tremendous design 
flexibility gained when using intellectual property (or IP blocks) 
through recent technological advances in compiler optimization, 
place and routing, and verification tools. 

As FPGAs have become faster and more powerful, the amount 
of work required of them has also grown. Years ago, an FPGA was 
expected to provide some essential glue logic to tie a board together. 
But today, the trend is to pull tasks historically handled by dedicated 
ASICs into the heart of the FPGA design. The increase in work 
handled by FPGAs can be attributed to the giant leaps in technology that 
have pushed speed and size to new levels. However, the challenge is 
effectively using an FPGA’s fabric resources, which requires a high-level 
understanding of the architecture and routing process. Luckily, FPGA 
vendors and third-party experts offer building blocks in the form of IP 
blocks (also called IP cores) to simplify the design process. 

IP cores cover a wide spectrum ranging from basic 
functions to extremely complex design blocks. Vendors and third 
parties offer cores including networking interfaces, system I/O 
interfaces, communications blocks, digital signal processing (DSP) 
functions and external memory interface controllers, as well as a 
suite of IP for embedded systems. These general categories 
feature several predefined functions such as those listed in Table 1. 
Cores, which can typically be parameterized to meet the specific 
needs of a design, are optimized to take advantage of the features 
of the FPGAs for which they were designed. Most cores from 
FPGA vendors come with documentation similar to stand-alone 
ASIC data sheets. You can see the growth in the popularity of 
code reuse by the growing online community of developers 
sharing open-source IP through Web sites such as OpenCores.org. 

System requirements can change over time. You should expect 
and plan for this at the outset of your design. Figure 1 shows 
how a hardware developer of a DSP board could implement an 
FPGA-based design built almost entirely on IP blocks. The block 
highlighted in red, which represents the “in-house IP” that the 
company creates to differentiate itself from the competition, is 
implemented as a processor core or as state machine logic. The 
company also must implement a communication protocol among 
blocks, so data can be bused between the different components. 

For instance, you can recycle a DSP application involving a 
specific chain of filters with different interfaces. Once you have 
developed an algorithm to perform the desired signal processing, 
you can alter the means by which data is presented to the filter 
relatively quickly, from an Ethernet interface to a PCI Express 
interface, without having to redesign the signal processing chain 
or the bulk of the data transfer interface. 

In the current technology environment, cutting-edge designs can 
quickly become outdated. With IP, you can save development time and 
avoid chasing after technology trends. For example, National 
Instruments uses the PCI Express LogiCORE from Xilinx to migrate its 
PCI-based data acquisition systems to the newer PCI Express standard. 

The design flexibility that IP brings to the table also provides you 
with the ability to use higher-level tools for rapid system development 
by simply dropping in an IP block. The perceived trade-off for going 
to a higher level of abstraction in programming languages is that an 
easier development experience comes at the cost of code optimization. 
However, because IP is often pre-compiled and optimized for a 
particular FPGA, the cost of going to higher-level tools is negligible. 
LabVIEW FPGA and Xilinx System Generator for DSP are examples of 
two such tools. 

Figure 2 illustrates the simplicity of a LabVIEW FPGA application 
that you can use to implement a LogiCORE FIR filter from Xilinx for 
targeting a Virtex-II FPGA found on National Instruments commercial 
off-the-shelf (COTS) hardware. The parallel nature of graphical 
programming maps intuitively to an FPGA. 


Category                                Example IP blocks 
Audio, Video, and Image Processing      JPEG Encoder; MPEG Encoder; 2-D Discrete Cosine Transform 
Basic Logic                             Comparators; Multiplexers; Counters 
Bus Interface and I/O                   SPI, I2C; PC/104, USB; CAN 
Communication and Networking            Interleaver; Gigabit Ethernet Controller; Viterbi Decoder 
Digital Signal Processing (DSP)         FFT, FIR, IIR, Reed-Solomon; Radar Pulse Compression; Digital Downconverter; Digital Upconverter 
Math                                    Multiplier Accumulator; Discrete Wavelet Transform; Sine Cosine Look-Up Table 
Memory Interface and Storage Element    Dual-Port Memory; SDRAM Controller; Asynchronous FIFO 

Table 1 Example IP blocks available from FPGA vendors and third parties. 


Under the Hood 

Developing intellectual property source code in an organized, 
modular and hierarchical fashion results in faster development, 
better performance and better code reuse in the long run. The 
individual function blocks, which serve a well-defined purpose, have 
clearly defined inputs, outputs and parameters. Because the block 
is well defined, you can more easily avoid negative feature creep, 
that is, adding unrelated features that are specific to a single use 
case. Instead, you can add features over time that make the function 
block more general-purpose to suit a wider array of applications. This 
is “positive” feature creep. Additionally, you can test, validate and 
optimize function blocks on an individual basis. Knowing these key 
IP development principles can help you develop your own in-house 
IP or use commercially available blocks. 

No matter the tool flow, the same design principles apply 
when optimizing your IP for a particular device. The key here is 
a solid understanding of the tools at your disposal, including the 
most effective ways to use them. This translates to deep 
architecture knowledge of your target FPGA. The standard features of 
programmable logic devices now include modular building blocks that 
improve size and performance for the majority of designs. Various 
device families include dedicated 18x18 multiplier blocks, 18 Kbit 
block memory, digital clock managers, FIFOs and a DSP48 slice to 
complement the rest of the configurable FPGA fabric. Understanding 
the capabilities of these blocks and the routing resources 
available helps you maximize your design performance. 

While you can often infer these blocks using synthesis tools 
with generic code, you need to ensure their optimal 
use. A good example of this is a custom multiply-accumulate 
(MAC) FIR filter. A MAC FIR is composed of storage elements, 
control logic, a multiplier and an adder. The specs for such a 
design should, at a minimum, include the data sample rate and the 
desired frequency response. At this point, take stock 
of the requested specifications and observe how they 
map to your hardware with regard to your target 
architecture. What is the ratio between the sample rate 
and the system clock? How many taps do you need and what bit 
width should you use taking quantization into account? 

When you answer these questions, do not just consider the 
algorithm. Also examine the building blocks already available. In a 
Virtex-4, the DSP48 slice is ideal for performing MACs with a system 
clock running up to 500 MHz. Once you determine how many 
coefficients you need, compare that to the different memory resources 
available. The four-input look-up tables map well to memories with 
depth increments of 16. However, as distributed memories become 
large, the speed they can run at decreases, and memory can become 
a bottleneck. At what point should you switch over to a dedicated 
block RAM? The answer to that question depends on your 
performance and resource requirement balance. In a predefined IP block, 
you never have 100 percent flexibility to control all of these details. 
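
The sizing questions above (clock-to-sample-rate ratio and tap count) reduce to simple arithmetic. The sketch below is a behavioral illustration under stated assumptions, not a hardware description or a vendor tool: it assumes one multiply-accumulate per clock cycle per MAC unit, uses integer rates in Hz, and the function names `macs_required` and `mac_fir` are hypothetical. The 500 MHz figure is the DSP48 clock rate quoted above; the sample rate and tap counts are made up for the example.

```python
import math


def macs_required(num_taps, sample_rate_hz, clock_hz):
    """Estimate how many MAC units a FIR needs when each MAC is time-shared.

    Assumes each unit performs one multiply-accumulate per clock cycle, so it
    can serve clock_hz // sample_rate_hz taps per input sample.
    """
    cycles_per_sample = clock_hz // sample_rate_hz
    if cycles_per_sample < 1:
        raise ValueError("sample rate exceeds the system clock")
    return math.ceil(num_taps / cycles_per_sample)


def mac_fir(samples, coeffs):
    """Behavioral model of a direct-form FIR built from sequential MACs."""
    history = [0] * len(coeffs)          # storage elements (delay line)
    outputs = []
    for sample in samples:
        history = [sample] + history[:-1]
        accumulator = 0
        for coeff, value in zip(coeffs, history):
            accumulator += coeff * value  # one multiply-accumulate per tap
        outputs.append(accumulator)
    return outputs


# 500 MHz clock and a hypothetical 5 MHz sample rate: 100 cycles per sample.
one_unit = macs_required(num_taps=64, sample_rate_hz=5_000_000, clock_hz=500_000_000)
three_units = macs_required(num_taps=250, sample_rate_hz=5_000_000, clock_hz=500_000_000)
impulse_response = mac_fir([1, 0, 0, 0], [1, 2, 3])
```

Feeding an impulse through `mac_fir` returns the coefficients themselves, which is a quick sanity check on any FIR model before bit widths and quantization are considered.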

Tweaking the Tools 

Once you have written the code and verified it behaviorally, 
you still have several steps left that influence the final hardware 
results. If you have been careful to map your design well to the 
device primitives, you should not need to iterate through the source 
code. To maximize the speed performance and minimize the area 
at this stage, you must know which buttons to push in synthesis and 
be familiar with the back-end tools. In a large design, hierarchy is 
important for organization. You can keep the boundaries between 
modules intact or you can allow the tools to blur them. 

Figure 1 

System-level diagram of an IP-centric FPGA design. 

Figure 2 

Example of using an IP block in a high-level 
programming environment (LabVIEW FPGA). 

Depending on the blocks that make up your design, maintaining 
the hierarchy for some parts of the design but not others may be wise. 
When a core has area or placement constraints that are necessary to 
meet a specified performance, retaining hierarchy allows the 
constraints to implement their intended functions. In other parts of the 
design, optimization across hierarchical boundaries helps the tools 
peer inside the netlist making up the core and look for ways to 
optimize signals that transmit between it and other parts of the design. 

You should approach timing closure in a similar way for designs 
involving custom functions or designs heavy in third-party IP. 
Register balancing and hierarchy optimization in the early stages help 
the place and route tools achieve your desired performance. 
Occasionally, you need additional placement constraints to meet all 
timing constraints. Technological advances make it easier to push the 
maximum performance of a design to higher levels. This means you can 
focus on the functionality and let the back-end tools find the optimal 
placement for registers within a combinatorial path. These tools also 
use intelligent algorithms to determine when register duplication 
provides an additional speed boost. 

Understanding the tools available to you as well as your hardware 
capabilities helps you produce efficient and robust designs. While 
creating custom functions and interfaces is part of any design, 
keeping reuse in mind helps you protect the investment on work you 
have already done. High-level tools can assist you in implementing 
this concept when using hierarchical and modular design flows. 
Finally, with IP cores, you can quickly create complex designs when 
your resources and time are limited. 

Embedded FPGA Soft Core 
Processor Enables Universal 
CompactPCI Applications 


by Pat Mead, Altera 

Barbara Schmitz, MEN Mikro Elektronik 


Programmable logic has reached such a state of advancement 
in terms of speed and density that it has become a 
truly attractive alternative to RISC and CISC processors. 
It can form a “matrix” within which processing, peripherals, 
data paths and algorithms can be placed to create powerful, 
flexible and upgradable systems. Programmable logic is now 
available in forms and sizes that range from the traditional 
use as glue logic up to structured ASIC replacements and even 
further. To date, to fully use the advances of this key technology 
(re-programmability, reusability and upgradability), you 
need to be an FPGA expert, but these benefits need to be opened 
to a much wider market. 

A Nios-II-CompactPCI development package designed as an 
open FPGA platform includes a sample design with a PCI system 
unit, integrating the standardized Wishbone bus and the Altera 
Avalon switch fabric. The PCI system unit forms the interface 
to the PCI bus, where the CPU board can then be addressed as a 
PCI slave. It connects to the Wishbone bus where an SDRAM and 
a flash controller are already implemented. 

The 3U CompactPCI card with a Cyclone FPGA and the 
integrated Nios II microcontroller soft core is designed for final 
use in volume production and acts at the same time as the 
standard FPGA development platform for this application 
(Figure 1). As a universal FPGA platform, the board has a multitude 
of directly accessible I/O pins. The Nios II CPU in the Cyclone 
FPGA provides performance similar to an ARM processor. It 
allows the use of the CPU board, for example, as an intelligent 
slave on the CompactPCI bus. The FPGA and the integrated 
processor core support a 32-bit bus at 33 MHz, control 32 Mbyte 
of SDRAM and support reads from and writes to the 2 Mbyte 
flash memory. The special flash structure provides initial 
programming using a boundary scan interface. Once configured, the 
FPGA may be reconfigured at any time during operation with 
data from the CompactPCI bus. 

The FPGA also controls four status LEDs and up to 83 
user-defined I/O pins. The final functionality of the board depends 
entirely on the application and can be anything from a simple 
UART solution up to a complex analog front end with DSP-like 
data pre-processing. In any case, the CompactPCI card supports 
a nearly endless range of applications. The designer can use IP 
cores to configure the function. This includes different serial 
interfaces from RS-232 to intelligent HDLC protocols up to Fast 
Ethernet. Other functions include graphics, fieldbus connections 
or digital I/O. Alternatively, the included development package 
allows the user to create custom cores or to integrate third-party 
cores from opencores.org or from Altera. 

Open-Platform Development Concept 

The user can now add any kind and number of IP cores to 
the Wishbone bus. To do this, a “Wishbone Bus Maker Tool” 
has been developed, which can be used to generate the 
Wishbone bus and which is part of the development package. The 
Wishbone Bus Maker can generate multi-master and multi-slave 
bus systems. A Wishbone-to-Avalon bridge and, vice versa, an 
Avalon-to-Wishbone bridge (Figure 2) allow the additional 
integration of Avalon-based IP cores and especially of the Nios II 
soft processor core. Nios II connects to the Avalon switch 
fabric, where a GPIO module for the user LED control is already 
implemented as well. The user can now also add any kind and 
number of IP cores to the Avalon switch fabric by using the 
SOPC Builder tool from Altera, which is part of the Quartus II 
development package. 

The open-platform concept for Nios-based development 
and integration of IP cores for Wishbone and the 
Avalon switch fabric can also be expanded on the hardware 
side. Currently a new generation of ANSI standard 
PMC and M-Modules is being designed that uses the 
Cyclone II family. The entire board logic including the 
individually configurable FPGA is located on the base PMC or 
M-Module, while the physical interfaces are implemented 
on an adapter card, which can be plugged either on the PMC 
or the M-Module. Since both PMC and M-Modules can be 
used on all kinds of system platforms using dedicated 
carrier boards, the user can now concentrate completely on the 
FPGA application. 

Flexibility—Base for 
Comprehensive Applications 

An example application is now in use in an automated, 
driverless underground railway. A leading manufacturer is using 
standard 19” systems. These redundantly built 3U CompactPCI 
systems feature a Pentium III CPU, analog and digital I/O, sensors 
for position encoders and an optional MVB link. A feedback channel 
allows data to be sent back continuously from the vehicle to the 
central control station. 

FPGAs are used in three different boards. The FPGA of the 
CPU board contains a watchdog and different UARTs; the NAND 
flash is controlled by the Nios II microcontroller. A second board 
features eight UARTs completely integrated in the FPGA used 
for asynchronous RS-422 operation and optional synchronous 
HDLC (without Nios). The third board features digital I/Os, 
analog outputs, counter pulses, radar sensor and interrupt inputs, all 
implemented in the FPGA hardware without using the Nios II. 
This standard CompactPCI system can be configured for the 
requirements of the operation of the underground trains in 
different cities and countries by only changing the content of the 
FPGA; the hardware remains the same. 

Modern low-cost FPGA components have a usable size. 
The Cyclone II family from Altera contains nearly 70,000 logic 
elements and their pricing is acceptable, starting from just a few 
dollars for the smaller devices. This makes them effective 
factors for cost savings and time-to-market when making 
individual configurations of standard products. A time-consuming 
and expensive redesign of a board can often be avoided through 
application-specific integration of IP cores in the FPGA. 
Furthermore, FPGA technology is indispensable wherever long-term 
availability or harsh industrial environments are involved. 
IP cores per se are not threatened by discontinuation, even if 
an FPGA component may be replaced by a newer one after 10 
years, for instance. 

The Nios II family of 32-bit RISC embedded processors 
delivers more than 100 DMIPS of performance when implemented 
in the Cyclone II family. Because the processors are 
soft core and flexible, it is possible to choose from a nearly 
unlimited combination of system configurations to meet the 
required performance, features and cost. The Nios II processor 
family consists of three cores, fast (Nios II/f), standard (Nios 
II/s) and economy (Nios II/e), each optimized for a specific 
price and performance range. All three cores share a common 
32-bit instruction set architecture and are 100 percent binary 
code compatible. A library of commonly used peripherals and 
interfaces is included in the Nios II development kit. A complete 
list of SOPC Builder-ready intellectual property (IP) and 
peripherals can be found at the Altera Web page. Using the 
interface-to-user-logic wizard in the SOPC Builder software 
enables the creation of custom peripherals and their integration 
into Nios II processor systems. 

The F206N from MEN Mikro is an example of a 3U 
CompactPCI board based on Altera Cyclone II. The 
functionality of the board is entirely dependent on 
the IP programmed into the FPGA. 

Bridges implemented inside the FPGA allow 
integration of standard IP cores along with Altera IP, 
including the Nios II soft core processor connected 
to Altera’s Avalon switch fabric. 

Optimized Use of Bus Resources 

The Avalon switch fabric enables multiple, simultaneous data 
transactions for maximum system throughput. SOPC Builder 
automatically generates an Avalon switch fabric optimized to the 
specific interconnect requirements of the final system processors 
and peripherals. In traditional bus architectures, a single arbiter 
controls the communication between the bus master and slaves. 
Each bus master requests control of the bus, and the arbiter then 
grants bus access to a single master. If multiple masters attempt 
to access the bus at once, the arbiter allocates bus resources to a 
master based on a fixed set of arbitration rules. This can lead to 
a bandwidth bottleneck as only one master can access the system 
bus and its resources at a time. 

The Avalon switch fabric’s simultaneous multi-master 
architecture increases the system’s bandwidth by eliminating this 
bottleneck (Figure 3). Using the Avalon switch fabric, each bus 
master gets its own dedicated interconnect, meaning that bus 
masters only contend for shared slaves, not for the bus itself. 
Each time a component is added or the peripheral access 
priorities change, SOPC Builder generates a newly optimized Avalon 
switch fabric with a minimum of FPGA resource use. The 
Avalon switch fabric supports a wide range of system architectures, 
including single- and multiple-master systems, and allows 
seamless data transfers between peripherals with 
performance-optimized data paths. Off-chip processors and peripherals are 
equally well supported. 
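
The difference between a single shared bus and slave-side arbitration can be illustrated with a toy cycle count. This is a simplified model under stated assumptions, not a description of how SOPC Builder generates the fabric: every transfer takes one cycle, each master issues its requests in order, and conflicts at a slave are resolved by a fixed priority (alphabetical master name). The master and slave names are hypothetical.

```python
def shared_bus_cycles(requests):
    """Single arbiter: one master owns the bus per cycle, so transfers serialize."""
    return sum(len(queue) for queue in requests.values())


def switch_fabric_cycles(requests):
    """Slave-side arbitration: a master stalls only if its target slave is busy."""
    queues = {master: list(queue) for master, queue in requests.items()}
    cycles = 0
    while any(queues.values()):
        busy_slaves = set()
        for master in sorted(queues):      # fixed priority, illustrative only
            queue = queues[master]
            if queue and queue[0] not in busy_slaves:
                busy_slaves.add(queue.pop(0))
        cycles += 1
    return cycles


# Two masters that mostly target different slaves.
mostly_disjoint = {
    "cpu": ["sdram", "sdram", "flash"],
    "dma": ["uart", "uart", "sdram"],
}
# Two masters that always target the same slave.
fully_shared = {"cpu": ["sdram", "sdram"], "dma": ["sdram"]}

bus_cycles = shared_bus_cycles(mostly_disjoint)
fabric_cycles = switch_fabric_cycles(mostly_disjoint)
```

In this model the fabric only helps when masters target different slaves; when every request goes to one slave, both schemes need the same number of cycles.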

Custom instructions allow developers using Nios II 
processors to increase system performance by extending the CPU 
instruction set to accelerate time-critical software. Using custom 
instructions enables the optimization of system performance in 
a way not possible with traditional off-the-shelf processors. The 
Nios II family of processors supports up to 256 custom 
instructions to accelerate logic or mathematically complex algorithms 
normally handled in software. For example, a block of logic that 
performs a cyclic redundancy code calculation on a 64 Kbyte 
buffer operates 27 times faster as a custom instruction than when 
performed by software. 
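
To see why a cyclic redundancy code is a natural candidate for a custom instruction, it helps to look at the software loop being replaced. The article does not say which CRC was measured; the sketch below assumes the common reflected CRC-32 (polynomial 0xEDB88320, as used by zlib) purely for illustration, and the function name `crc32_bitwise` is hypothetical.

```python
import zlib


def crc32_bitwise(data, crc=0):
    """Bit-at-a-time CRC-32 (reflected, polynomial 0xEDB88320).

    The inner loop runs eight shift-and-conditional-XOR steps per byte in
    software; dedicated logic can evaluate the same update in parallel.
    """
    crc ^= 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xEDB88320
            else:
                crc >>= 1
    return crc ^ 0xFFFFFFFF


# Cross-check against the standard library on a small buffer.
buffer = bytes(range(256)) * 16
matches_zlib = crc32_bitwise(buffer) == zlib.crc32(buffer)
```

For a 64 Kbyte buffer this loop executes more than half a million shift-and-XOR steps, which gives a feel for the work a hardware implementation can absorb.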

Nios II processors support fixed and variable cycle 
operations, include a wizard for importing user logic as a custom 
instruction, and automatically create software macros for use in 
developers’ code. Large blocks of data can be processed 
concurrently with CPU operation by adding application-specific 
hardware accelerators that act as custom co-processors within the 
FPGA. Using the cyclic redundancy code example, processing 
a 64 Kbyte buffer runs 530 times faster with hardware 
accelerators than software. SOPC Builder includes a wizard that allows 
developers to add their acceleration logic and DMA channel to 
the system. 

A complete set of tools is available for the hardware design, 
including the SOPC Builder system development tool, Quartus 
II design software, ModelSim-Altera software and SignalTap II 
embedded logic analyzer. Hardware design for creating Nios II 
processor-based systems uses the SOPC Builder system 
development tool to specify, configure and generate systems. 
Launching from within the Quartus II design software, SOPC Builder 
provides an intuitive wizard-driven graphical user interface for 
creating, configuring and generating 
system-on-a-programmable-chip (SOPC) designs. 

To make the software design flow as easy as possible, it is 
possible to accomplish all software development tasks within 
the Nios II IDE, including editing, building, debugging 
programs and flash programming. As part of the Nios II IDE, 
Altera partners with operating system and middleware 
providers for additional software development tools. A PC, an Altera 
FPGA device and a JTAG download cable are everything you 
need to develop and debug Nios II processor-based systems. 
The Nios II architecture supports a JTAG debug module that 
provides on-chip emulation features to control the processor 
remotely from a host PC. The IDE can communicate with the JTAG 
module on one or more processors. This allows downloading 
programs to memory, starting and stopping program execution, 
setting breakpoints and watch points, analysing registers and 
memory and collecting real-time execution data. 

The instruction set simulator (ISS) makes it possible to begin
developing programs before the target hardware platform is ready.
Many designs also incorporate flash memory on the board. Any
CFI-compliant flash device connected to the FPGA can be programmed
using the IDE flash programmer. The flash programmer is
pre-configured to work with all of the boards available with the
Nios II development kits, and can be easily ported to any custom
hardware. In addition to a project set-up wizard, the IDE provides
software code examples, in the form of project templates, to help
bring up working systems as quickly as possible.

The IDE enables quick system customization using system software.
The hardware abstraction layer (HAL) library is a lightweight
runtime environment that provides a simple device driver interface
for programs to communicate with underlying hardware. MicroC/OS-II
from Micrium is a complete, portable, ROM-able, pre-emptive
real-time kernel. It ships with all development kits and includes
full source code, a reference manual and a free developers’ licence.

The development kit also includes an open-source lwIP TCP/IP stack,
which is built to work with MicroC/OS-II applications and implements
the standard UNIX socket API, as well as a full-featured Linux
operating system. □

FPGA-Based Development
for Defense and 
Aerospace Applications 

The right FPGA design toolkits not only speed development, but can simplify 
the addition of custom IP so designs can be tailored to specific applications. 

by Steve Edwards 

Curtiss-Wright Controls Embedded Computing 


The parallelism, speed and I/O flexibility provided by today’s
FPGAs make it possible for system engineers to replace multiple
processor boards with a single FPGA COTS board. For the defense and
aerospace market, where high performance frequently must be traded
off with power and size/weight restrictions, these FPGA boards offer
the best of both worlds: high performance in a single slot.

Formerly, FPGA solutions had the reputation of being costly, due to
long development cycles and high development costs compared to
traditional software-based solutions. But today’s FPGA device
families, combined with appropriate FPGA design kits, help engineers
get designs to market rapidly. In addition, they offer lower costs
and greater flexibility that simplifies adding the system developer’s
intellectual property (IP).

One of the prime advantages of today’s FPGAs is the balance 
they provide between processing and I/O. This balanced approach 
makes FPGAs very efficient at simultaneously processing several 
high-speed, parallel data streams. Such I/O versatility means that 
multiple banks and types of high-speed memory, including DDR 
SDRAM and DDR SRAM, can be connected to an FPGA. 

Many newer families of FPGAs, such as Xilinx’s Virtex-II 
Pro and Virtex 4, feature high-speed serial transceivers, each 
capable of throughput of 3.125 Gbits/s or greater. These trans¬ 
ceivers support many of the high-speed serial interfaces used in 
emerging defense/aerospace applications, such as Serial RapidIO 
(SRIO), PCI Express (PCIe), 10 Gigabit Attachment Unit Inter¬ 
face (XAUI) and Gigabit Ethernet. 
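
As a rough worked example of what such a line rate delivers, the Python sketch below converts a serial line rate into usable payload bandwidth. It assumes 8b/10b line coding, in which every 8 data bits travel as 10 line bits, and the commonly quoted line rates for XAUI, first-generation PCI Express and 1000BASE-X Gigabit Ethernet. Those figures and the function name are assumptions added here for illustration.

```python
def payload_gbps(line_rate_gbaud, lanes=1, payload_bits=8, coded_bits=10):
    """Usable bandwidth in Gbits/s after line-coding overhead is removed."""
    return line_rate_gbaud * lanes * payload_bits / coded_bits


# XAUI: four lanes at 3.125 Gbaud each, 8b/10b coded.
xaui = payload_gbps(3.125, lanes=4)

# One first-generation PCI Express lane at 2.5 Gbaud, 8b/10b coded.
pcie_x1 = payload_gbps(2.5)

# Gigabit Ethernet over a serial link (1000BASE-X) at 1.25 Gbaud.
gig_e = payload_gbps(1.25)

print(xaui, pcie_x1, gig_e)
```

Seen this way, a 3.125 Gbits/s transceiver is the building block for a 10 Gbits/s XAUI port: four lanes, each carrying 2.5 Gbits/s of payload.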

The combination of parallel processing with fast, synchronous
memories and high-speed serial transceivers lets system designers
replace multiple processing cards with a single card containing two
or more FPGAs. This results in systems with lower power
requirements, weight and cost.

FPGA-Based COTS Boards 

FPGA-based COTS boards targeted to the defense/aerospace 
market often share certain common elements. For example, they 
all provide one or more high-density, high-performance FPGAs, 
high-speed serial I/O and high-performance memory. 

But hardware architecture is only part of the solution. Realizing
the full value that FPGAs can deliver depends on having the right
FPGA design tools. Some of these can shorten time-to-market and lower
development costs, while also providing an open architecture that
makes it possible to tailor the design to specific applications.

Systems engineers evaluating FPGA boards must address 
certain challenges. These include how to add their algorithms 
to the FPGA and how to simulate the design at the system level 
to ensure that these algorithms work. Because FPGA develop¬ 
ment can be costly, engineers also must evaluate whether their 
approach will save time and money and if the resulting solution 
will be robust and reliable. 

To understand the importance of design tools, it is useful to
consider how data is handled by an FPGA board. For example, one
board (Figure 1) uses two Virtex-II Pro VP-70 or VP-100 FPGAs,
several high-speed serial interfaces and a large amount of
high-speed memory to achieve a high-performance, reconfigurable
computing engine. The combination of DDR SDRAM for bulk storage and
DDR SRAM for fast, non-sequential storage of algorithm data allows
flexibility in mapping algorithms to this architecture.

Figure 1

Curtiss-Wright’s CHAMP-FX architecture balances high-speed processing, memory and I/O.

In a typical application, data comes in via a high-speed serial
interface and is stored in SDRAM. This data is then pulled out of
SDRAM by the end application and processed via the customer’s
algorithm. Intermediate results are stored in internal or external
SRAM and the final result is decimated to a lower data rate.

The FPGA Developers Kit 

For DDR SDRAMs, FPGA designers must confront the challenges of
aligning data and data strobes, tight timing constraints, signal
integrity issues and simultaneously switching output (SSO) noise. In
addition, certain design issues can prolong design cycles or force
designers to accept reduced performance.

To make matters worse, all of these hurdles become more pronounced
at high frequencies. On a read of data from SDRAM, the data is valid
for only two to three nanoseconds. Significant effort is required to
latch it reliably inside the FPGA and then synchronize it with the
rest of the logic there. The FPGA designer is thus faced with
significant challenges that may take several man-months to complete,
but that can be solved by using the IP in some FPGA developers kits.

In the ideal FPGA developers kit, all of the high-speed IP provided
is fixed to certain regions within the FPGA (Figure 2). This is done
to ensure that all critical paths meet timing, as well as to confine
the overall IP design to a small region of the chip to minimize
logic resources.

FPGA IP Designed Specifically for the Hardware 

Designers must also make sure that the IP works with the 
hardware. One potential challenge is the issue of SSO noise. In a 
Xilinx FPGA, when too many outputs in a particular bank switch 
at the same time, simultaneously switching output introduces 
noise into the system and causes one or more bits in a particular 
bank to flip to the wrong value. 

This can be a tricky problem to track down if the designer 
is not familiar with the SSO phenomenon and with the particular 
board hardware involved. The FPGA board supplier should test 
the board with SSO in mind, and its memory interface pinouts 
should be selected to avoid this issue. 

A DDR SDRAM read cycle provides a good example of how the FPGA IP
and the board hardware must work together. In order to read data,
the FPGA sends out a clock signal to the SDRAM and waits two clock
cycles for a return four-word burst. The challenge is how to clock
the data back into the device. Because of trace delays in the PCB,
there is skew between the transmitted clock signal and the incoming
data. Clocking the data in directly with the transmitted clock is
therefore not an optimal design approach, because skew narrows the
window during which valid data can be clocked into the FPGA.

An internal crossbar switch is used to connect user IP to IP
supplied in the CHAMPtools-FX FPGA Developers Kit.

Figure 2

To minimize routing delays and improve timing, I/O IP blocks are
located along the edge of the FPGA near I/O pins.

Figure 4

The ability to input known data and check results is a key feature of
an FPGA developers kit simulation environment.

A better approach is to compensate for trace delays in the FPGA by
phase shifting the internal clock relative to the incoming data.
This allows the FPGA to clock in read data with the appropriate
timing margins. However, to do this the FPGA IP designer must be
intimately familiar with the PCB characteristics of the FPGA-SDRAM
interface.
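
The arithmetic behind such a phase shift can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name is hypothetical, and the 5 ns clock period, 1.5 ns round-trip delay and 2 ns valid window are made-up numbers rather than measurements from any particular board. The idea is to place the capture edge in the middle of the data-valid window so the margin is equal on both sides.

```python
def capture_phase_shift(clock_period_ns, window_start_ns, window_width_ns):
    """Return (delay_ns, degrees) that centers the capture edge in the window.

    window_start_ns is when valid data arrives, measured from the unshifted
    internal clock edge; it models the round-trip PCB trace delay.
    """
    center_ns = window_start_ns + window_width_ns / 2.0
    delay_ns = center_ns % clock_period_ns   # wrap into a single clock period
    degrees = 360.0 * delay_ns / clock_period_ns
    return delay_ns, degrees


# Hypothetical numbers: 5 ns clock period, data valid from 1.5 ns to 3.5 ns.
delay_ns, degrees = capture_phase_shift(5.0, 1.5, 2.0)
margin_ns = 2.0 / 2.0   # margin on each side when the edge is centered
print(delay_ns, degrees, margin_ns)
```

With a valid window of only two to three nanoseconds, a nanosecond of uncorrected skew uses up most of the available margin. That is why the window start has to come from real knowledge of the board's traces.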

Some argue that, to simplify the design process and reduce de¬ 
sign cycle time, designers should use memory-controller IP cores 
provided by FPGA vendors or third-party suppliers. This is a valid 
assertion if understood in the right context. FPGA IP that has been 
designed for a particular hardware platform with the end applica¬ 
tion in mind, and that has been tested and qualified for use in that 
application, will certainly save the designer time and money. 

Interfaces and IP Integration 

The IP in the FPGA developers kit is extremely important. 
For example, IP in the CHAMPtools-FX FPGA Developers Kit, 
for the Xilinx-based board described above, is 
built around one of two standard Xilinx inter¬ 
faces: IP Interconnect (IPIC) or Local Link. IP 
Interconnect is a standard memory-mapped in¬ 
terface that allows the IP block to send data to 
a particular memory address. SDRAM, SRAM 
and PCI IP are provided with IPIC interfaces. 

Local Link is a streaming data interface in 
which the data’s destination address is predeter¬ 
mined. Local Link interfaces are provided with 
DMA controllers so that engineers can set up a 
destination buffer to a memory-mapped location. 
As data comes into the Local Link interface, the 
DMA automatically sends it to the correct mem¬ 
ory address. 
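
The behavior described above can be modeled in a few lines of Python. This is a toy software model and not the kit's IP: the class name `StreamDma`, its parameters and the list standing in for memory are all hypothetical. It shows the essential contract, in which the stream itself carries no addresses and the DMA, once given a destination buffer, places each arriving word at the next location.

```python
class StreamDma:
    """Toy model of a DMA engine sitting behind a streaming interface."""

    def __init__(self, memory, base_addr, length):
        if base_addr < 0 or base_addr + length > len(memory):
            raise ValueError("destination buffer lies outside memory")
        self.memory = memory        # list standing in for memory-mapped storage
        self.base_addr = base_addr  # start of the destination buffer
        self.length = length        # buffer size in words
        self.offset = 0             # words written so far

    def push(self, word):
        """Accept one word from the stream and store it at the next address."""
        if self.offset >= self.length:
            raise BufferError("destination buffer is full")
        self.memory[self.base_addr + self.offset] = word
        self.offset += 1


# Set up a four-word destination buffer at address 4, then stream data in.
memory = [0] * 16
dma = StreamDma(memory, base_addr=4, length=4)
for word in (10, 20, 30, 40):
    dma.push(word)
print(memory)
```

The producer of the stream never sees an address, which is the practical difference from a memory-mapped interface where each transfer names its destination.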

Local Link interfaces usually include RocketIO IP. Both of these
interfaces must be well documented in the FPGA developers kit. This
documentation should describe the signals for each interface, as
well as timing diagrams and anything else needed to develop custom
IP using one of these interfaces.

Another important part of the developers 
kit should be an IP block called the IPIC switch. 
This block allows other IP blocks with IPIC or Local Link interfaces
to be connected together so that data can be transferred among them
(Figure 3). Multiple IPIC switches can be instantiated in the
design, allowing engineers flexibility in choosing the best solution.

Finally, integration is easy when all of the IP can be integrated
within the Xilinx Embedded Development Kit framework. This allows
designers to easily instantiate or remove IP blocks and synthesize
the design, using a GUI-based software framework.

Simulation and Hardware Test 

Design verification, including functional and timing verification,
typically consumes the biggest part of the FPGA development cycle.
As design complexity increases, this causes a dramatic increase in
simulation time. Many FPGA designers recommend budgeting 50% of IP
development to simulation. Some of this time is spent simulating at
the IP block level, but a good portion is also spent in full
simulation of the entire design. This is the last hurdle for the
FPGA designer before testing the design on the target hardware.

After all of the bugs have been removed from individual 
blocks, the design needs to be put together and simulated. This 
must be done before testing in silicon. 

Simulation requires a testbench environment that includes models of
all of the external interfaces. It also requires the ability to
initialize memory interfaces, pass data from the PCI and RocketIO
into the FPGA and check memory, as well as the ability to test PCI
or RocketIO data against expected data previously stored in files.
The FPGA developers kit must provide all the features needed for a
full functional simulation of the FPGA.

Today, many FPGA algorithms are de¬ 
veloped using a tool such as Matlab, which 
enables designers to input real data into 
their algorithms and capture the results in a 
file. These input and output results can then 
be reused during simulation. 

The simulation environment provided by the Xilinx-based board’s
developers kit lets engineers input data from a file into the
simulation from the board’s external PCI or RocketIO interfaces.
Simulation data can be stored to a file at any of the memories or
the PCI or RocketIO interfaces. These results can then be compared
to result data captured during the Matlab simulation, enabling
developers to verify the correctness of their VHDL IP model versus
the mathematical model originally created in Matlab. The ability to
check data at any of the memory interfaces is particularly useful
for allowing developers to check integrity at multiple stages within
an algorithm.
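
The comparison step can be illustrated with a short Python sketch. It assumes, purely for illustration, that both the reference results and the simulation output are text with one integer per line; the function names are hypothetical and in-memory streams stand in for the two files.

```python
import io


def load_words(text_stream):
    """Read one integer per line (decimal or 0x-prefixed hex), skipping blanks."""
    words = []
    for line in text_stream:
        line = line.strip()
        if line:
            words.append(int(line, 0))
    return words


def find_mismatches(expected, actual):
    """Return ([(index, expected, actual), ...], lengths_match)."""
    mismatches = [
        (index, exp, act)
        for index, (exp, act) in enumerate(zip(expected, actual))
        if exp != act
    ]
    return mismatches, len(expected) == len(actual)


# Stand-ins for a reference file and a simulation dump (hypothetical data).
golden = load_words(io.StringIO("1\n2\n3\n0x10\n"))
simulated = load_words(io.StringIO("1\n2\n5\n16\n"))

mismatches, lengths_match = find_mismatches(golden, simulated)
print(mismatches, lengths_match)
```

Running the same comparison on data captured at each memory interface shows which stage of the algorithm first diverges from the reference, instead of only reporting that the final output is wrong.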

The final step is testing the design on the hardware. At this point
the design is almost complete, but there is still the possibility
that the test environment did not test for all possible conditions,
or that there is some variation between simulation and the actual
hardware. The Xilinx-based board’s developers kit makes it possible
to automatically insert Xilinx ChipScope IP onto either the IPIC or
Local Link interface and specify which ports to monitor. This is an
easy way to add a circuit logic analyzer function to the FPGA
design, especially at boundary points between customer IP and IP
provided by the development tools vendor.

System engineers need the right tools in their toolkit to complete
the job well, on time and under budget. The ideal FPGA developers
kit provides those tools, so that the system engineer can design a
robust, reliable product. It is an absolute necessity for getting
the job done. □

Curtiss-Wright Controls Embedded Computing 
Leesburg, VA. 

(703) 779-7800. 

[www.cwcembedded.com]. 




Slice Architecture Tackles Growing Thermal Demands of High
Performance

Themis Computer has announced its new Slice switched computing
initiative, designed to meet the escalating demands for thermal and
kinetic management imposed by next-generation
high-density/high-performance mission-critical computing. A
processor-independent architecture, the Themis “Slice” platform
allows users to mix, match and manage SPARC and x86 architectures,
and Solaris, Windows and Linux operating systems, in combination
with third-party network servers, storage and switches.

Quorum, Themis’ real-time, policy-based resource manager, ensures contracted application quality of service (QoS) for heterogeneous computing 
resources. Designed for high-density, high-performance computing, the Themis Slice architecture is suited for those who are looking for highly 
available, horizontally scalable processing power and lower life cycle cost of ownership. 



Themis Slice is offered in air and liquid cooling variants that
provide thermal headroom to accommodate aggressive scaling of
commercial microprocessor core density, speed and power. All Themis
Slice elements are inherently rugged, have a uniform mechanical
footprint and a standard rack height (1-RU) and depth (22”). Up to
five Slice elements, including a common Power Slice, can be combined
in a 5RU docking station, or Subrack. The Subrack blind mates with
connectors on the Slice element, providing power distribution, cable
management and dripless couplings for liquid cooling of the
constituent Slice Subrack modules. The Subrack allows liquid and
air-cooled Slices to be intermixed within a single docking station,
providing blind-mated dripless couplers to cooling liquid manifolds
in the docking station. Subracks can be interconnected, using
external switches, to configure highly scalable grid computing
systems.

Within the Subrack, Slice elements plug into an InfiniBand
high-speed, low-latency serial switch fabric, so that clusters of up
to five Slice elements can be interconnected without external
switches. Processor Slices interface with General Purpose I/O (GPIO)
Slices over a PCI-Express high-speed serial fabric, also supported
by the Subrack. Other Cluster elements include Storage Slices and a
four-port Gig-E Target Channel Adapter co-located in the Power Slice.


Up to four Processor Slices with up to 64 cores can be configured in a single 5U Subrack. The cluster fabric, internal to the Subrack uses 
InfiniBand switches and links for superior (low) memory-to-memory transfer latency and scalable bandwidth. Subrack clusters can be configured 
using either Gig-E or InfiniBand. Themis Slice Subracks can be combined with IBM or Sun Servers, in heterogeneous networks, using either 
networking technology. At the Subrack level, Themis Slice technology is a truly open architecture that offers superior SWAP and environmental 
resiliency. Processor Slices are priced from $10,500. Processor Slice configurations bundled with the Subrack docking station and power supplies, 
are priced from $26,000, in OEM quantities. 

Themis Computer, Fremont, CA. (510) 252-0870. [www.themis.com]. 


2.0 GHz CPU Performance Plus Data Acquisition 
on a Single EPIC Board 

A 2.0 GHz processor teams with auto-calibrating A/D circuitry on a
single EPIC form-factor board to conserve space and reduce cost for
applications that need to combine a high-speed CPU with data
acquisition capability. These include military systems,
transportation systems, in-vehicle control, medical instrumentation
and industrial control systems. The Poseidon from Diamond Systems
offers the VIA Eden ULV or the VIA C7 processor along with USB 2.0,
Gigabit Ethernet, SATA mass storage and the automatically
calibrating A/D interface.



The Poseidon SBC is offered with either a fanless 1.0 GHz Eden ULV
or 2.0 GHz VIA C7, both with on-chip cache and 400 MHz front-side
bus along with the VIA CX700 chipset. The board comes with 256 Mbyte
or 512 Mbyte 533 MHz DDR2 RAM soldered onto the board. The chipset
integrates the VIA UniChrome Pro 2D/3D graphics controller with
MPEG-2 hardware acceleration, CRT and LVDS flat panel support and
dual display capability. Included are also four USB 2.0 ports, four
RS-232 serial ports (two with RS-422/485 capability) and both IDE
and S-ATA interfaces. Single-unit pricing starts at $700.


Diamond Systems, Mountain View, CA (650) 810-2500. 
[www.diamondsystems.com]. 


Small, Low-Power Controller Board Has CAN 

In systems with a large amount of industrial I/O in tight, remote 
spaces, it can be difficult to add functionality or turn a specific function 
on or off without having to reprogram the entire system. A RoHS-com¬ 
pliant board from Micro/sys allows exactly that. 

The MCB58, a new member of the Micro/sys SNAP series of
microcontroller boards, is based on the Freescale HC(S)08 MCU.
Measuring 3.55 in. x 2.65 in. x 0.5 in., the MCB58 typically draws a
low 300 mW while running at its fastest speeds and executes with a
50-nanosecond instruction time. The board’s wide assortment of I/O
includes a PC/104 slot, an RS-485 serial port, four isolated digital
outputs, four isolated digital inputs and 24 additional lines of
TTL-level digital I/O. Filters allow four channels of PWM to act as
four independent D/A converters. Also included are an onboard
temperature detector, a real-time clock, a CAN interface and a 16k
serial EEPROM.

The board contains 60 Kbytes of program/data flash and 4 Kbytes 
of SRAM. Metrowerks’ CodeWarrior compiler is included in the SDK. 
Single quantity pricing is $95. 

Micro/sys, Montrose, CA. (818) 244-4600. [www.embeddedsys.com]. 

Programmable Internet Controller for Machine-to-Machine
Applications

A low-cost programmable Internet Controller chip lets the developer
determine its pin-out and functionality by its “firmware flavor.”
The iChip CO120SQ from ConnectOne ships from the factory with the
basic boot block. Designers can then choose the firmware flavor that
includes the features and Internet protocols (IP) needed for their
application-specific designs. The CO120SQ’s low power consumption
and firmware updateability make it suitable for devices such as
point-of-sale terminals, gateways, medical devices, utility meters
and handheld devices that use wired or wireless LANs.

The 64-pin TQFP CO120SQ is targeted to meet machine-to-machine (M2M)
application connectivity needs. Designed for 802.11b/g WiFi and
10/100BaseT LAN access, CO120’s firmware flavors enable a
combination of up to 10 TCP/UDP sockets; two listen sockets; clients
for HTTP, SMTP and FTP; and the SerialNET port server operating mode
for serial-to-IP bridging. A choice of locally or remotely
updateable firmware is available, and protocol maintenance is free
from ConnectOne. Firmware flavors for cellular and dial-up modem
applications will be available in Q3.

The iChip works with any host processor and any or no host RTOS.
Since iChip offloads TCP/IP communications from the host
application, minimal development time or IP expertise is required.
Design-in time typically is one man-month. Pricing is $11.25 for
over 10,000 units. Development board II-EVB-511 for WiFi
applications sells for $450 and II-EVB-501 for LAN sells for $275.

ConnectOne, Phoenix, AZ. (408) 986-9602. [www.connectone.com]. 



Drop-in ETX Replacement Module Solves EOL 
Problems 


Several non-RoHS ETX modules used in a wide range of high-vol¬ 
ume commercial applications recently reached end-of-life (EOL) status, 
which has left stranded developers of systems based on these boards. To 
help keep OEMs in production, Ampro Computers has introduced an 
ETX module that drops into existing baseboard designs. 



Utilizing the new RoHS-compliant AMD Geode LX 800 single-chip
integrated processor and Northbridge, the ETX 610 targets
applications ranging from building automation to voting machines.
The 500 MHz Geode LX 800 is integrated with 2D graphics, a memory
controller and a PCI bridge. To facilitate fanless system designs,
the CPU has a low thermal design power (TDP) rating of 3.9W. The
module also contains DDR SODIMM RAM to 1 Gbyte, as well as USB 2.0
ports, EIDE and Serial ATA (SATA) interfaces for migration, 10/100
Mbit Ethernet, ACPI power management, PCI expansion and ISA bus
expansion for custom circuitry on the ETX baseboard. LVDS flat panel
LCDs are supported.


ETX 610 QuickStart Kits include drivers and BSPs for Windows XP,
Windows XP Embedded and Windows CE 5.0, as well as a full Linux 2.6
distribution (Fedora Core 3). Modules will begin shipping by late
May. Pricing for production volumes is in the low $200s.


Ampro Computers, San Jose, CA. (408) 360-0200. 
[www.ampro.com]. 


Low-Power Rugged Pentium M SBC in Single 
CompactPCI Slot 

Targeted for embedded, rugged applications with low power
consumption, the CRM1 from Dynatem is built around a Pentium M
processor that utilizes a new microarchitecture to meet the current
and future demands of high-performance, low-power embedded
computing, making it suitable for communications and industrial
automation applications. It features advanced branch prediction
capability, micro-ops fusion for improved instruction execution, and
a dedicated hardware stack manager that employs hardware control for
improved stack management.

Onboard CompactFlash permits single-slot booting. I/O routed to the
backplane includes an EIDE port, two Serial ATA ports, two Gb
Ethernet ports (PICMG 2.16-compatible), DVO/VGA, four USB 2.0 ports,
two COM ports and two PMC expansion sites. The 855GME chipset offers
integrated, high-performance graphics with resolutions up to
1600 x 1200.

The CRM1 complies with VITA 30.1-2002, so it comes with top and
bottom cooling plates that are bonded to the major components
through thermal conduction and to the heat-conducting printed
circuit board mechanically. Dynatem offers board support packages
for such popular operating systems as VxWorks, Windows NT, Windows
XP, Linux, QNX and RTX. Pricing for the CRM1 starts at $5,300 in
single quantity.

Dynatem, Mission Viejo, CA. (949) 855-3235. [www.dynatem.com]. 



High-Speed A/D I/O Card Offers Four 105 MHz 
Channels 

Getting real-world signals into the signal processing environment
can be a big challenge. A new high-speed analog-to-digital I/O board
from BittWare provides four channels of 105 MHz A/D conversion and a
reconfigurable FPGA.

The Tetra-PMC-H A/D I/O card features four high-performance, 14-bit
wideband A/D converters running at up to 105 MHz. These stream data
directly to an Altera Cyclone II FPGA for A/D control, distribution
of converted data and front-end processing. Data pre-processing
functions can be configured to enable digital filtering, decimation
and digital down conversion. Data can be transferred to the
baseboard over the PMC interface by the Cyclone II.

In addition to complete software development tools that allow de¬ 
signers to easily develop application code and integrate the Tetra-PMC-H 
into their systems, BittWare offers a Tetra developer’s kit for the Cyclone 
II that includes source for the A/D converters and the link ports. The 
Tetra-PMC-H will be available in Q2 of 2006 at a list price of $3,995. 
BittWare, Concord, NH. (603) 226-0404. [www.bittware.com]. 

3U Conduction-Cooled CompactPCI SBC for
Severe Environments 


A new, rugged conduction-cooled 3U cPCI Single Board Computer (SBC)
provides full system health monitoring and reporting, meeting all
PICMG 2.9 specifications. The ROCK (CC61x) from General Micro
Systems targets extreme applications requiring high performance, low
power consumption and operation in extreme temperatures. The
highest-performance version is driven by the 1.4 GHz Pentium M-738
processor with 2 Mbytes of L2 cache. The microcontroller and FPGA
are operational below and above the stated board operating
temperatures (-40° to +85°C). The ROCK is also available in a
convection-cooled version (0° to +50°C).



A patent-pending design utilizing a microcontroller performs the
baseboard management controller (BMC) functions per PICMG 2.9,
reports the results of the built-in test (BIT) and extended built-in
test (EBIT), and monitors and controls the baseboard temperature.
The CC61x comes with up to 2 Gbytes of ECC memory and up to 16
Gbytes of flash memory. The ROCK provides dual Gigabit Ethernet with
a TCP/IP offload engine on a PCI-X (66 MHz/64-bit) bus. This
dual-channel Gigabit device provides true Gigabit wire speeds in
full duplex mode and utilizes a fraction of the CPU bandwidth when
fully loaded. Support is available under Windows XP/2000, Linux, QNX
and VxWorks. Quantity 100 pricing for the conduction-cooled version
starts at $4,370.


General Micro Systems, Rancho Cucamonga, CA (800) 307-4863. 
[www.gms4sbc.com]. 


3U CompactPCI A/D Board Sports Eight 
Channels 


Squeezing high-speed acoustic technology used for sonar and vi¬ 
bration analysis applications into a 3U CompactPCI form-factor is no 
mean feat. But it has been done by ICS Sensor Processing in the com¬ 
pany’s new A/D converter board, to achieve substantial reductions in 
size and weight. 


The ICS-1745 high-speed acoustic A/D converter board features eight
channels of high-frequency acoustic analog input and onboard signal
conditioning with programmable gain. All channels use the Analog
Devices AD9260 16-bit, high-speed oversampled A/D converter with
buffer memory to deliver 2.5 Msamples/s. Onboard signal conditioning
eliminates the requirement for external signal conditioning logic.
With a maximum bandwidth of 1.25 MHz, the ICS-1745 supports four
input voltage ranges (20 Vpp, 2 Vpp, 0.2 Vpp and 0.02 Vpp
differential) and 8 Mbytes of memory in two banks.

The anti-alias cut-off filter frequency is fixed at 1.25 MHz
standard, and other frequencies can be supplied on request.
Differential analog input is provided via the front panel, while a
second front-panel connector makes possible multiple board
synchronization for systems requiring high channel counts. Windows
and Linux drivers are available. Price is $6,235 in OEM quantities.



Interactive Circuits & Systems, Part of Radstone Embedded 
Computing. Ottawa, Canada. (613) 749-9241. [www.ics-ltd.com]. 


High-Density Digital and Analog I/O Card For 
Industrial Apps 

A high-density PC/104 I/O card includes a 16-channel, 12-bit A/D
converter, an 8-channel D/A converter and 48 lines of digital I/O,
and it operates in a temperature range from -40° to +80°C. The
PCM-MIO from WinSystems requires no trimpots for calibration of the
analog circuitry to remain within specs. The ability to do without
adjustment with potentiometers results in quick and easy setup of
analog systems needing accurate digitally controlled voltages. This
also eliminates the need for a field technician to perform the
setup.

The analog section of the board is designed with ultra low-noise
power supplies and a low-drift voltage reference as well as with
special layout techniques and low-drift resistor networks to
maintain long-term accuracy and stability. The PCM-MIO is compatible
with isolated signal conditioners that will protect, filter and
isolate the analog input and output signals from electrical
transients. All 48 digital I/O lines are individually programmable
for input, output or output with read-back. The lines are
TTL-compatible and can source and sink 12 mA, which allows direct
connection to industry standard, optically isolated AC and DC signal
conditioners. Drivers are available for Linux, Windows CE and
Windows XP Embedded. Pricing starts at $395.

WinSystems, Arlington, TX. (817) 548-1358. [www.winsystems.com]. 



SBC for Mobile Apps Has 5 Low-Power Modes 

Designers of mobile high-performance embedded systems, such 
as handheld and wearable computers or small unmanned vehicles, 
need to be able to fine-tune power consumption. The new BitsyXb 
SBC from Applied Data Systems features five low-power modes to 
help make the SBC power-stingy, as well as dynamic variable speed 
and voltage regulation. 

The compact, 3 in. x 5 in. BitsyXb is based on Intel’s 32-bit, 520
MHz XScale PXA270 CPU, with a video interface up to XGA resolution.
Up to 128 Mbytes of SDRAM program memory and up to 64 Mbytes of
flash memory are provided. 128 Kbytes of EPROM is included as a boot
device. For expansion and connectivity, the board has a PCMCIA Type
II interface, three serial ports, a USB port, an Intel QuickCapture
camera sensor input bus, 10 digital I/Os, an SPI port, an I²C bus
and ADSmartIO with nine configurable inputs/outputs.

An onboard power supply accepts an input voltage of 5V or 6-16V. The
board consumes less than 1W during operation and is ruggedized for -45°
to +85°C. Windows CE .NET and Linux are supported. The BitsyXb
SBC is priced in the $300s.

Applied Data Systems, Columbia, MD. (301) 490-4007. 
[www.applieddata.net]. 


Core Module Integrates Wired/Wireless
Networking

As wireless devices of all kinds are increasingly connected to wired
networks, design engineers are challenged with finding a single, secure
solution for both. A new embedded core module from Digi International
integrates 10/100 Mbit Ethernet with secure 802.11a/b/g wireless LAN
networking capabilities.

Powered by Digi’s 155 MHz ARM9-based NetSilicon NS9360 processor, the
ConnectCore Wi-9C provides up to 256 Mbytes of integrated SDRAM/flash
memory in a compact SO-DIMM form factor. For I/O connectivity, the
ConnectCore Wi-9C features USB, UART, I²C, SPI, PWM and GPIO
interfaces. Wireless security protocols supported include WEP, WPA and
WPA2/802.11i.

The ConnectCore Wi-9C is pre-certified, eliminating costly certification
delays during product development. It is also RoHS-compliant.
Operating temperature range is -40° to +85°C. Development kits for
NET+Works, Linux and Windows CE are available. The ConnectCore
Wi-9C will be available in Q3 2006 starting at $149 each in quantities
of 1,000.

Digi International, Minnetonka, MN. (952) 912-3444. [www.digi.com]. 



Graphical Design Platform Available for ADI’s 
Blackfin Processor 

Engineers who are domain experts often want to develop applications
using a single graphical platform all the way from algorithm design and
prototyping to deployment and test. Now they can do this for
applications based on Analog Devices’ Blackfin processors. National
Instruments has extended the LabVIEW graphical dataflow development
environment by releasing the LabVIEW Embedded Module for ADI Blackfin
processors.

The software features more than 140 Blackfin-specific, hand-optimized
math, analysis and signal processing functions. Integrated I/O
such as audio and video D/A converters, A/D converters and codecs are
provided, as well as on-chip debugging and easy graphical interconnection
via Ethernet. The module includes the ADI VisualDSP++ C development
and debugging environment for low-level access and real-time,
interactive debugging and deployment directly to Blackfin.

Engineers can debug code graphically in LabVIEW or simultaneously
debug both the graphical code and generated C source code. The
module ships with application examples such as audio, control, power
monitoring and communications and provides easy connectivity to NI
test and measurement hardware. Pricing starts at $6,995.

National Instruments, Austin, TX. (512) 683-0100. [www.ni.com].


Red Rock for VMEbus and CompactPCI®

NEW Product!

DVD-RW / CD-RW / CDROM
in VMEbus form factor
with Ultra Wide SCSI LVD interface.

See the full line of VMEbus single and multi-slot mass storage module
products.

Red Rock Technologies, Inc., 480-483-3777


Data Acquisition

> 2 ch 210 Msps 12-bit A/D
> 2 ch 160 Msps 16-bit D/A
> Configurable digital I/Os
> Virtex-II FPGA
> USB 2 or standalone


Configurable Digital I/O

> Virtex-II FPGA 1M gates
> 90 configurable digital I/Os
> RS232/485 and LVPECL
> USB 2 or standalone


Programmable hardware with cables, device drivers, loading
examples and power supply.

Systems can be used connected to a PC using USB, or can
function standalone (without USB) using the initialisation
PROMs.

sales@hunteng.co.uk, +44 (0)1273 760188


Real-Time & Embedded
Computing Conference


Evaluate Your Options 
Register now at www.rtecc.com 


Enter a World of Embedded Computing Solutions

Attend open-door technical sessions especially designed for those developing computer systems and 
time-critical applications. Get ahead with sessions on Embedded Linux, VME, PCI Express, ATCA, DSP, 
FPGA, Java, RTOS, SwitchFabric Interconnects, Windows, Wireless Connectivity, and much more. 


Your Resource Opportunity 

Exhibits arranged in a unique setting to talk face-to-face with technical experts. Table-top exhibits 
make it easy to compare technologies, ask probing questions and discover insights that will make a 
big difference in your embedded computing world. Join us for this complimentary event! 


Sponsors

Green Hills
LynuxWorks™
SBS Technologies
Wind River


2006 Locations

Dallas, June 6

Houston, June 8

Boulder, June 20

Salt Lake City, June 22

Detroit 

Los Angeles 

Taipei 

Dublin 

Toronto 

Lucerne 

Washington DC 

Montreal 

Calgary 

Eindhoven 

Shenzhen 

Ottawa 

Vancouver BC 

Bristol 

Shanghai 

Seattle 

San Diego 

Patuxent River 

Beijing 

Portland 


Pentium M PC/104-Plus Single Board Computer

A new Pentium M-based single board computer is targeted at applications
requiring substantial processing power and extensive features
in a compact design, such as medical, avionics, navigation/tracking,
system monitoring and security/homeland defense markets. The Cheetah
from VersaLogic is suited for embedded control applications requiring a
very small footprint, which the 3.6” x 3.8” PC/104-Plus board provides.
Standard onboard features include two COM ports, two USB 2.0 ports,
Ethernet, IDE, LPT, audio and PS/2 keyboard/mouse support. The board
also features integrated high-performance video output with support for
both analog monitors and LVDS flat panels. The Extreme Graphics 2
video processor includes high-speed 3D rendering, full-motion video
and MPEG-2 decoding.



The PC/104-Plus interface supports both ISA and PCI add-on modules.
Standard pass-through connectors allow the board to be stacked with
other PC/104 modules. It may also be used as a CPU module for a larger
system by plugging it into a proprietary base board that includes
specific user I/O circuitry. The Cheetah also includes a customizable,
OEM-enhanced BIOS that is field-upgradeable. It is designed to work
with embedded operating systems, including Windows CE/XP/XPe, Linux,
VxWorks, QNX, DOS and other real-time OSs. Pricing is about $1,500 in
OEM quantities.


VersaLogic, Eugene, OR. (541) 485-8575. 
[www.VersaLogic.com]. 


ETX COM Express Module Sports Intel Core Duo 

The new PICMG COM Express standard for computer-on-module
(COM) form factors promises higher performance and I/O bandwidth
in a much smaller space. A new COM from Kontron, the ETXexpress-CD,
leverages the Intel Core Duo processor to provide extremely high
performance for COM Express Basic Form Factor modules.



The 95 mm x 125 mm ETXexpress-CD includes the Mobile Intel 945GM
chipset and the ICH7M Southbridge to deliver up to 2x 2 GHz processor
performance. It supports up to 2 Gbytes of DDR2-SDRAM. Up to five
PCI Express x1 lanes and a 32-bit/33 MHz PCI bus are included, as well
as a Gigabit Ethernet port, a PCI Express Graphics x16 lane, two Serial
ATA interfaces, one Parallel ATA interface and eight USB 2.0 ports.
CRT and LVDS outputs are provided to drive high-resolution monitors
and displays.

Windows XP, Windows XP Embedded, 
Windows 2000 and Linux are supported. The 
ETXexpress-CD is RoHS-compliant. Prices start at $800, depending on 
processor speed. 


Kontron America, Poway, CA. (888) 294-4558. [www.kontron.com]. 


ETXexpress-CD

ETXexpress products are next-generation embedded modules
based on the PICMG COM Express standard. ETXexpress provides
the highest performance and I/O bandwidth available in COMs.

> PCI Express - the elemental data path

> Gigabit Ethernet - for high connectivity

> USB 2.0 - for fast periphery

> Serial ATA - for fast drives

> ACPI - for optimized power management

Get ready. Get ETXexpress


Get Connected with technology and companies providing solutions now

Get Connected is a new resource for further exploration into products,
technologies and companies. Whether your goal is to research the latest
datasheet from a company, speak directly with an Application Engineer,
or jump to a company's technical page, the goal of Get Connected is to
put you in touch with the right resource. Whichever level of service you
require for whatever type of technology, Get Connected will help you
connect with the companies and products you are searching for.

www.rtcmagazine.com/getconnected 


Company                                  Page       Website

ARCOM                                    32         www.arcom.com
Artesyn Technologies                     13         www.artesyn.com
BitMicro Networks, Inc.                  6          www.bitmicro.com
Critical I/O                             37         www.criticalio.com
Diversified Technology                   7          www.dtims.com
ELMA Electronic, Inc.                    26         www.elma.com
Embedded Planet                          27         www.embeddedplanet.com
GE Fanuc Embedded Systems                8          www.gefanuc.com/embedded
Hunt Engineering Ltd.                    75         www.hunt-rtg.com
Kontron America                          19,77,80   www.kontron.com
KUKA Controls GmbH                       39         www.kuka-controls.de
Lippert Embedded Computers               42         www.lippert-at.com
Mercury Computer Systems                 33         www.mercury.com
Microsoft Tour                           43         www.microsoftembedded.com
Microsoft Windows Embedded               29         www.learnaboutembedded.com
National Instruments                     17         www.ni.com
Octagon Systems                          2,3        www.octagonsystems.com
One Stop Systems                         41         www.onestopsystems.com
Performance Technologies                 23         www.pt.com
Phoenix International                    6          www.phenxint.com
QNX Software Systems, Ltd.               11         www.qnx.com
Real-Time & Embedded
  Computing Conference                   76         www.rtecc.com
Real-Time Innovations, Inc.              20         www.rti.com
Red Rock Technologies, Inc.              75         www.redrocktech.com
RICOH Electronics, Inc.                  21         www.rei.ricoh.com
SBE, Inc.                                40         www.sbei.com
SBS Technologies                         4          www.sbs.com
Teligy                                   28         www.teligy.com
Thales Computers                         34         www.thalescomputers.com
Trenton Technology                       22         www.trentontechnology.com
Ultimate Solutions                       35         www.ultsol.com
VadaTech                                 79         www.vadatech.com

FPGAs: The News Matrix for Design Advertiser Index


Company                                  Page       Website

Acromag                                  51         www.acromag.com
Aeroflex Microelectronic Solutions       54         www.aeroflex.com/radhardfpga
Altera Corporation                       44         www.altera.com
Annapolis Micro Systems                  55         www.annapmicro.com
BittWare                                 62         www.bittware.com
Interactive Circuits and Systems         63         www.ics-ltd.com
Synplicity, Inc.                         56         www.synplicity.com
Technobox, Inc.                          46         www.technobox.com
Themis Computer                          67         www.themis.com
VPT, Inc.                                71         www.vpt-inc.com


RTC (ISSN #1092-1524) magazine is published monthly at 905 Calle Amanecer, Ste. 250, San Clemente, CA 92673. Periodical postage paid at San Clemente and at additional mailing offices.
POSTMASTER: Send address changes to RTC, 905 Calle Amanecer, Ste. 250, San Clemente, CA 92673.

RTC Magazine May 2006



SHARE YOUR VISION





Share your vision with us. We’ll customize our products to your
requirements or partner with you to develop custom products all the
way through deployment. Either way, you’ll get leading-edge board-level
solutions for the most demanding embedded applications. Tell us what
you need at info@vadatech.com.

Comprehensive Hardware/Software
solution for Intelligent Platform
Management Interface (IPMI) version 2.0

• Symmetric Multi-Processing CPU for
AMC/VME Modules

• A complete line of ATCA
and AMC carriers

• High dynamic range A/D Converters



www.vadatech.com 702.896.3337





The power of duo 


dual core 

OPEN MODULAR SOLUTIONS 



CP6012 6U CPCI processor blade

> Scalable up to 2.0GHz Intel® Core™ Duo, FSB 667MHz

> Up to 4GB Dual Channel Memory DDR2 400MHz

> 4x GbE ports (2x Front, 2x PICMG 2.16)

> PMC or XMC slot; CompactFlash and SATA 2.5" HDD



CP307 3U CPCI processor blade

> Scalable up to 2.0GHz Intel® Core™ Duo, FSB 667MHz

> Up to 4GB Dual Channel Memory DDR2 667MHz

> 2x GbE, up to 6x USB 2.0 ports, VGA/DVI interface

> 2x SATA, onboard CompactFlash


Greater Capacity. Same Footprint. So Much More Potential. 

Explore the power and the potential of two cores in one processor with Kontron CompactPCI boards designed with
the Intel® Core™ Duo processor. Nearly double your processing power and achieve the unprecedented versatility to
run different applications and OSes on each core using VT-x virtualization technology. With so much packaged into
a single slot footprint, it's easy to imagine a whole new world of embedded possibilities.

Embed your next system application with Kontron using Intel® Core™ Duo processors. 

www.kontron.com/open